Python’s stranglehold on data science pushed Neo4j to the cloud • The Register

The Neo4j graph database specialist has launched the graphing analytics workspace as a fully managed cloud service.

It’s called AuraDS, and it’s aimed directly at data scientists hoping to use the features of Neo4j’s graphing database and machine learning library, promising to free data scientists from the dreary tasks of creating databases and extending the appeal of graph databases, according to Karl. Olofson, IDC Vice President for Research.

Neo4j launched a workspace of native graph analytics, graph database, scalable graph algorithms, machine learning libraries, and graph visualization, all in one environment in April 2020.

Later that year, she added Chart Weddings to her repertoire.

With the latest release, the Data Science Toolkit comes as a fully managed cloud service with a native client for Python, which is popular among data scientists.

said Alicia Frame, Director of Graph Data Science at Neo4j log: “Data scientists really love Python. Unlike developer projects, there is no diversity. Everyone uses Python.”

Although it technically supports Python, Neo4j built its data science workspace using Cypher – the SQL-like query language – as the main interface.

Frame said that while data scientists loved the capabilities of the graph, they really struggled. “They said, ‘Do you want me to learn a new language?’ I don’t understand your drivers. It’s hard.”

“We’ve learned how powerful Python is, and how almost any friction will discourage a new user.”

In addition to native support for Python, Neo4j AuraDS includes access to more than 65 graph algorithms in a single workspace as well as in-graph ML models to reduce data science burden.

Frame said Neo4j has built firewalls into the product so users don’t inadvertently break the backend. Now if you try to run an algorithm for which you don’t have enough memory, instead of letting you do that, you get a message saying, ‘Hey, you’re trying to run the graph, and it requires that much memory. “

I have no doubt that there will be resistance at first from the handicraft owners

Users are given the choice to either kill a process, resize the server instance, or risk their own, she said.

Graph databases are suitable for high-dimensional problems, and as such were applied in their early days for understanding social networks. Financial risk management and fraud detection are becoming other common use cases while the concept is gaining traction in chemistry, biology, and drug research.

“The most important thing to think about is the connected data — it is by definition very high dimensional. There is a lot of information encoded in those relationships.

So, the typical data science workflow: I have a data frame; God forbid, I have a spreadsheet; I have an array; and I don’t really care how things relate to each other. I treat everything as a uniquely distributed individual.

“With a graph, you go from a table to a huge adjacency matrix, where I suddenly have to think about each data point. What other data points does it connect to? Then what do its connections connect to? And that’s how you get,” said Frame, who has worked with graph databases. Graphic in Microbiology and Genetics before joining the tech industry, “Very high dimensions really really fast”.

IDC’s Olofson said that the Neo4j Graph DBMS is designed to handle complex problems and is primarily implemented in a scaling fashion, which means the user has to anticipate the most demanding query and scale their system to support it.

“AuraDS enables dynamic scaling, so the problem is out of the user’s hands. This is the most natural way to deploy Neo4j,” he said.

Users are set to be impressed by its ease of use, Olofson said, and the ability to allocate resources dynamically. “Configuring a graph database can be challenging, especially when it comes to data science, so AuraDS frees users from that effort so they can focus on analytics.”

The advantages, he said, mean that most users will mistake a managed cloud service.

“I have no doubt that there will be resistance at first from people doing it yourself, but it is clear that the overall benefits outweigh the subtle advantages that can be gained through explicit physical deployment, most of which will certainly move to the cloud, except in special cases Rare are cases where certain system requirements require manual setup.”

AuraDS is available on Google Cloud. It is scheduled to launch on AWS later this year, and Azure shortly after. ®

Leave a Comment