Machine Learning in Astronomy

Is astronomy data science?

Machine learning in astronomy: sure, it sounds like an oxymoron, but is that really the case? Machine learning is one of the newest 'sciences', while astronomy is one of the oldest. In fact, astronomy developed naturally because people realized that studying the stars was not only fascinating but also helpful in their daily lives. For example, studying star cycles helped create calendars (such as the Maya and the Proto-Bulgarian calendars). It also played an important role in navigation and orientation.

Of particular importance was the early development of observational analysis using mathematical, geometric, and other scientific methods. It originated with the Babylonians, who laid the foundations for an astronomical tradition that would continue in many other civilizations. Since then, data analysis has played a central role in astronomy.

So, after millennia of refining techniques for data analysis, you would think that no dataset could present a problem to astronomers, right?

Well... that's not entirely true. The main problem astronomers face now is, strange as it may seem, advances in technology.

Wait, what?! How can good technology be a problem? It most certainly can. What I mean by good technology is the larger field of view (FOV) of telescopes and the higher resolution of detectors. Combined, these mean that today's telescopes collect far more data than previous-generation instruments. And that means astronomers must deal with amounts of data they have never seen before.

How was the Galaxy Zoo project born?

In 2007, Kevin Schawinski found himself in exactly this situation.

As an astronomer at Oxford University, one of his tasks was to classify images of 900,000 galaxies collected by the Sloan Digital Sky Survey over 7 years. He had to look at every single image and note whether the galaxy was elliptical or spiral and whether it was rotating. The task seems pretty trivial. However, the sheer amount of data made it almost impossible. Why? Because it was estimated that one person would have to work 24/7 for up to 5 years to complete it! Talk about a heavy workload! So, after working at it for a week, Schawinski and his colleague Chris Lintott decided there had to be a better way.

And so Galaxy Zoo, a citizen science project, was born. If this is the first time you've heard of it, citizen science means that members of the public participate in professional scientific research. In short, Schawinski and Lintott's idea was to distribute the images online and recruit volunteers to help label the galaxies. This is possible because identifying a galaxy as elliptical or spiral is quite straightforward.

Initially, they hoped for a few tens of thousands of contributors.

To their surprise, however, more than 150,000 people volunteered for the project, and the images were classified in about 2 years. Galaxy Zoo was a success, and more projects followed, such as Galaxy Zoo Supernova and Galaxy Zoo Hubble. In fact, there are many active projects to this day.

Using thousands of volunteers to analyze data may seem like a success, but it also shows how much trouble we are in right now. It took well over 100,000 people about 2 years just to classify (never mind perform complex analysis on) data collected from a single telescope! And now we're building telescopes a hundred, even a thousand times more powerful. That said, in a few years volunteers will not be enough to analyze the huge amounts of data we receive.

To put this in perspective, the rule of thumb in astronomy is that the information we collect doubles every year. For example, the Hubble Space Telescope has been collecting about 20 GB of data every week since 1990. And the Large Synoptic Survey Telescope (LSST), planned for the early 2020s, is expected to collect tens of terabytes of data each night.

But that is nothing compared to the most ambitious project in astronomy: the Square Kilometre Array (SKA). SKA is a radio telescope project to be built in Australia and South Africa, expected to be completed by 2024. With its 2,000 radio dishes and 2 million low-frequency antennas, it is expected to produce more than 1 exabyte per day. That is more than the entire internet produces in a whole year, generated in just one day!
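To get a feel for these figures, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes decimal units (1 exabyte = 10^9 GB), takes the SKA rate to be 1 exabyte per day, treats Hubble's 20 GB per week as constant, and the function names are made up for illustration.

```python
# Back-of-the-envelope comparison of the data rates quoted above.
# Assumptions: decimal units (1 EB = 1e9 GB), a constant 20 GB/week
# for Hubble, and an SKA rate of 1 exabyte per day.

GB_PER_EXABYTE = 1e9
WEEKS_PER_YEAR = 52


def hubble_total_gb(years, gb_per_week=20):
    """Total data collected at a constant weekly rate."""
    return years * WEEKS_PER_YEAR * gb_per_week


def years_to_match_one_ska_day(gb_per_week=20, ska_exabytes_per_day=1):
    """Years a 20 GB/week instrument would need to match one SKA day."""
    weeks = ska_exabytes_per_day * GB_PER_EXABYTE / gb_per_week
    return weeks / WEEKS_PER_YEAR


def growth_after(years):
    """Growth factor if the data volume doubles every year."""
    return 2 ** years


print(hubble_total_gb(30))                  # 31200 (GB, roughly 31 TB)
print(round(years_to_match_one_ska_day()))  # 961538 (years)
print(growth_after(10))                     # 1024
```

In other words, under these assumptions an instrument collecting at Hubble's rate would need close to a million years to match a single day of SKA output, and a yearly doubling adds up to a thousandfold increase per decade.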

Wow, can you imagine?!

With that in mind, it is clear that this monstrous amount of data will not be analyzed by online volunteers. Therefore, researchers are now recruiting a different kind of ally: machines.

Why is everyone talking about machine learning?

Big data, machines, new knowledge... you know where we're going, don't you?

Machine learning.

Well, it turns out that machine learning in astronomy is a thing too. Why?

First of all, machine learning can process data faster than other techniques. But it can also analyze data without our instructions on how to do it. This is extremely important because machine learning can pick up on things we don't know how to look for and can recognize unexpected patterns. For example, it may distinguish different types of galaxies before we even know they exist.

This leads us to believe that machine learning is also less biased than we humans are, and therefore more reliable in its analysis. For example, we might think that there is a certain number of galaxy types, but to a machine the same galaxies may well fall into a larger number of distinct groups. And that would certainly improve our humble understanding of the universe.
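The idea of a machine grouping objects without being told the answer can be sketched with k-means clustering, a standard unsupervised method. This is a minimal illustration, not how any survey actually classifies galaxies: the data is synthetic, the two features per object are made up, and the number of groups `k` is still chosen by hand here.

```python
import random


def kmeans(points, k, iterations=20):
    """Group 2-D points into k clusters without using any labels."""
    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # Farthest-point initialization: start from the first point, then keep
    # adding the point that is farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))

    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels, centers


# Synthetic "galaxies": three well-separated groups, 50 objects each.
# The algorithm never sees which group an object came from.
rng = random.Random(0)
true_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
          for cx, cy in true_centers for _ in range(50)]

labels, centers = kmeans(points, k=3)
print(sorted(labels.count(j) for j in range(3)))  # [50, 50, 50]
```

Because the groups are recovered from the data alone, running the same procedure with a different `k` is one simple way to ask whether the data supports more groups than we expected.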

As complicated as these problems may be, the real power of machine learning is not limited to solving classification problems. In fact, it has a wide range of applications that extend to problems we previously considered unsolvable.

What is gravitational lensing?

In 2017, a Stanford University research team demonstrated the effectiveness of machine learning by using neural networks to study images of gravitational lenses.

Gravitational lensing is an effect in which the strong gravitational field around massive objects (e.g. a cluster of galaxies) bends light and produces distorted images. It is one of the major predictions of Einstein's general theory of relativity. That's all well and good, but you might be wondering: why is it useful to study this effect?

Well, you have to understand that ordinary matter is not the only source of gravity. Scientists have proposed that there is an "invisible matter", called dark matter, which makes up most of the matter in the universe. We are unable to observe it directly (hence the name), but gravitational lensing is one way to "sense" its influence and confirm its presence.

Previously, this type of analysis was a tedious process that involved comparing real images of lenses with a large number of computer simulations of mathematical lensing models. This could take weeks to months for a single lens. Now, that is what I call an inefficient method.

But with the help of neural networks, the researchers were able to perform the same analysis in a matter of seconds (and, in principle, on a cell phone's microchip), which they demonstrated using real images from NASA's Hubble Space Telescope. That is definitely impressive!

Overall, the ability to sift through large amounts of data and analyze it in a very fast and fully automated fashion could transform astrophysics in a way that future sky surveys badly need. Those surveys will look deeper into the universe and produce more data than ever before.

What is the current use of machine learning?

Now that we know how powerful machine learning can be, it is only natural to ask ourselves: is machine learning already deployed in astronomy for something useful?

The answer is: kind of. The truth is that the use of machine learning in astronomy is a very novel technique. Although astronomers have long used computational techniques, such as simulations, to aid their research, ML is a different kind of beast.

Still, there are some examples of using ML in real life.

Let's start with the simplest. Images from telescopes often contain "noise". What we consider noise is any random fluctuation that is not related to the observation. For example, wind and the structure of the atmosphere can affect the image produced by a telescope on the ground, because the light has to travel through the air on its way. That's why we send some telescopes into space: to remove the effects of the Earth's atmosphere. But how can you eliminate the noise produced by these factors? With a machine learning algorithm called a generative adversarial network, or GAN.

A GAN consists of two elements: a neural network that tries to generate objects (the "generator") and another one (the "discriminator") that tries to guess whether an object is real or generated. This is a very common and successful technique for removing noise, one that already dominates the self-driving car industry. In astronomy, an image needs to be as clear as possible. That is why this technique is becoming widely used.
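The two-player setup described above can be sketched in a few lines. This is a deliberately tiny toy, not an image denoiser and not the setup used in any astronomy pipeline: the "real data" is just numbers drawn around 4, the generator is a straight line, the discriminator is a single logistic unit, and all names and constants are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=1000, lr=0.01, seed=0):
    """Alternate discriminator and generator updates on 1-D data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * z + b
    w, c = 0.1, 0.0   # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = rng.gauss(4.0, 1.0)   # a sample of "real" data
        z = rng.gauss(0.0, 1.0)      # random input for the generator
        fake = a * z + b

        # Discriminator step: gradient ascent on
        # log D(real) + log(1 - D(fake)), i.e. tell real from generated.
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        w += lr * ((1 - d_real) * real - d_fake * fake)
        c += lr * ((1 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(fake),
        # i.e. try to make the discriminator call the fake sample real.
        d_fake = sigmoid(w * fake + c)
        a += lr * (1 - d_fake) * w * z
        b += lr * (1 - d_fake) * w
    return (a, b), (w, c)


(gen_a, gen_b), (disc_w, disc_c) = train_toy_gan()
print("generator offset:", gen_b)
print("D(4.0) =", sigmoid(disc_w * 4.0 + disc_c))
```

The point of the sketch is the alternation: each step first improves the discriminator against the current generator, then improves the generator against the updated discriminator. Real GANs replace both parts with deep networks and train on images.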

Another example of AI comes from NASA.

Although, for the moment, it has nothing to do with space: I am talking about detecting wildfires and floods. NASA has trained machines to recognize smoke from wildfires using satellite images. The goal? To deploy hundreds of small satellites, all equipped with machine learning algorithms embedded within their sensors. With such a capability, the sensors could identify wildfires and send the data back to Earth in real time, providing firefighters and others with up-to-date information that could dramatically improve their response.

Anything else?

Yes: NASA's research on another important application of machine learning, probe landings. One technique for space exploration is to send probes to land on asteroids, gather material, and send it back to Earth. Currently, to select a suitable landing site, the probe must take pictures of the asteroid from every angle and send them back to Earth; scientists then analyze the images manually and give the probe instructions on what to do.

This elaborate process is not only complex but also rather limiting, for a number of reasons. First of all, it is very demanding for the people working on the project. More importantly, you have to keep in mind that these probes can be very far from home. Therefore, signals carrying commands may take minutes or even hours to arrive, which makes fine-tuning impossible. That is why NASA is trying to cut this "informational umbilical cord" and enable the probe to recognize the asteroid's 3D structure and choose a landing site on its own. And the way to achieve this is by using neural networks.
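The delay comes straight from the speed of light: a radio signal needs distance divided by c to arrive. A rough sketch of the arithmetic, where the probe distances are hypothetical examples rather than figures for any real mission:

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
KM_PER_AU = 149_597_870.7           # one astronomical unit in km


def one_way_delay_minutes(distance_au):
    """Time for a radio signal to cover the given distance, in minutes."""
    return distance_au * KM_PER_AU / SPEED_OF_LIGHT_KM_S / 60


def round_trip_delay_minutes(distance_au):
    """Images travel to Earth, then a command travels back to the probe."""
    return 2 * one_way_delay_minutes(distance_au)


for au in (1, 2, 30):   # hypothetical probe distances from Earth
    print(au, "AU:",
          round(one_way_delay_minutes(au), 1), "min one way,",
          round(round_trip_delay_minutes(au), 1), "min round trip")
```

At 1 AU (the Earth-Sun distance) the one-way delay is already about 8.3 minutes, so any decision that depends on a reply from Earth is at least a quarter of an hour old by the time the probe can act on it.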

What are the barriers and limitations of machine learning in astronomy?

If machine learning is so powerful, why has it taken so long to be applied?

Well, one reason is that you need a lot of labeled and processed data to train a machine learning algorithm. Until recently, there simply wasn't enough data on some of the more exotic astronomical phenomena for computers to learn from.

It should also be noted that neural networks are a kind of black box: we don't have a deep understanding of how they work and how they arrive at their answers. Therefore, scientists are understandably wary of using tools without fully understanding how they work.

While we at 365 are very excited about all the ML developments in data science, we should note that they come with some limitations.

Many assume that neural networks have very high accuracy and little to no bias. While this may be generally true, researchers need to understand that the inputs (or training data) we feed to the algorithm can negatively affect the results. An AI learns from its training set. Therefore, any biases, intentionally or unintentionally built into the initial data, may persist in the algorithm.

For example, if we assume that there are only three types of galaxies, a supervised learning algorithm will end up believing that there are only three types of galaxies.

Therefore, although the computer itself did not add additional bias, it still reflects our own. That is, we can teach computers to think in a biased way. It also follows that ML may not be able to identify some revolutionary new models.
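A small sketch makes this concrete. It uses a nearest-centroid classifier, one of the simplest supervised methods, on made-up two-feature data; the three labels and all the numbers are purely illustrative.

```python
def fit_centroids(samples):
    """samples: list of ((x, y), label). Returns the mean point per label."""
    sums = {}
    for (x, y), label in samples:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    def dist2(label):
        cx, cy = centroids[label]
        return (point[0] - cx) ** 2 + (point[1] - cy) ** 2
    return min(centroids, key=dist2)


# Hypothetical training set: we only ever labeled three types.
training = [((0.0, 0.1), "elliptical"), ((0.2, -0.1), "elliptical"),
            ((5.0, 5.1), "spiral"), ((5.2, 4.9), "spiral"),
            ((0.1, 5.0), "irregular"), ((-0.1, 5.2), "irregular")]
centroids = fit_centroids(training)

# An object unlike anything in the training set still gets one of the
# three known labels, because the classifier has no other answer to give.
novel_object = (50.0, 50.0)
print(predict(centroids, novel_object))  # spiral
```

The classifier never says "this is something new". Spotting a genuinely new kind of object takes an extra ingredient, such as an unsupervised method or an explicit check for inputs that lie far from all the training data.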

These factors are not deal-breakers. However, scientists using this tool should keep them in mind.

So what's next for machine learning?

The data we produce shapes the world we live in. Therefore, we need to introduce data processing techniques (such as machine learning) into every aspect of science. The more researchers start using machine learning, the more demand there will be for graduates with experience in it. Machine learning is a hot topic even today, and in the future it is only going to grow. And we have yet to see what milestones we will achieve using AI and ML and how they will transform our lives.