Add support for Memgraph and supply chain notebook #522

Open
wants to merge 3 commits into
base: main
1 change: 1 addition & 0 deletions ChangeLog.md
@@ -5,6 +5,7 @@ Starting with v1.31.6, this file will contain a record of major features and upd
## Upcoming
- Added `--explain-type` option to `%%gremlin` ([Link to PR](https://github.com/aws/graph-notebook/pull/503))
- Fixed kernel crashing with ZMQ errors on magic execution ([Link to PR](https://github.com/aws/graph-notebook/pull/517))
- Added Memgraph as an additional graph database and the supply chain analysis notebook ([Link to PR](https://github.com/aws/graph-notebook/pull/522))

## Release 3.8.2 (June 5, 2023)
- New Sample Applications - Healthcare and Life Sciences notebooks ([Link to PR](https://github.com/aws/graph-notebook/pull/484))
27 changes: 27 additions & 0 deletions README.md
@@ -24,6 +24,7 @@ Instructions for connecting to the following graph databases:
| [Blazegraph](#blazegraph) | RDF | SPARQL |
|[Amazon Neptune](#amazon-neptune)| property graph or RDF | Gremlin or SPARQL |
| [Neo4J](#neo4j) | property graph | Cypher |
| [Memgraph](#memgraph) | property graph | Cypher |

We encourage others to contribute configurations they find useful. There is an [`additional-databases`](https://github.com/aws/graph-notebook/blob/main/additional-databases) folder where more information can be found.

@@ -192,6 +193,7 @@ Configuration options can be set using the `%graph_notebook_config` magic comman
| sparql | SPARQL connection object | ``` { "path": "sparql" } ``` | string |
| gremlin | Gremlin connection object | ``` { "username": "", "password": "", "traversal_source": "g", "message_serializer": "graphsonv3" } ```| string |
| neo4j | Neo4J connection object |``` { "username": "neo4j", "password": "password", "auth": true, "database": null } ``` | string |
| memgraph | Memgraph connection object |``` { "username": "", "password": "", "auth": false, "database": "memgraph" } ``` | string |

### Gremlin Server

@@ -345,6 +347,31 @@ Ensure that you also specify the `%%oc bolt` option when submitting queries to t

To setup a new local Neo4J Desktop database for use with the graph notebook, check out the [Neo4J Desktop User Interface Guide](https://neo4j.com/developer/neo4j-desktop/).

### Memgraph

Change the configuration using `%%graph_notebook_config` and modify the `host`, `port`, and `ssl` fields.

After local setup of Memgraph is complete, set the following configuration to connect from graph-notebook:

```
%%graph_notebook_config
{
"host": "localhost",
"port": 7687,
"ssl": false
}
```

Ensure that you specify the `%%oc bolt` option when submitting queries to the Bolt endpoint. For example, to run a Cypher query over the Bolt protocol:

```
%%oc bolt
MATCH (n)
RETURN count(n)
```

For more details on how to run Memgraph, refer to its [notebook guide](./additional-databases/memgraph/README.md).

## Building From Source

A pre-release distribution can be built from the graph-notebook repository via the following steps:
49 changes: 49 additions & 0 deletions additional-databases/memgraph/README.md
@@ -0,0 +1,49 @@
## Connecting graph notebook to Memgraph Bolt Endpoint

[Memgraph](https://memgraph.com/) is an open-source in-memory graph database built for high-performance, advanced analytics. Memgraph is compatible with the Neo4J Bolt protocol and uses the Cypher query language.

For a quick start, run the following command in your terminal to start Memgraph Platform in a Docker container:

```
docker run -it -p 7687:7687 -p 7444:7444 -p 3000:3000 -e MEMGRAPH="--bolt-server-name-for-init=Neo4j/" memgraph/memgraph-platform
```

The above command starts the Memgraph database, MAGE (graph algorithms library), and Memgraph Lab (visual user interface). For additional instructions on setting up and running Memgraph locally, refer to the [Memgraph documentation](https://memgraph.com/docs/memgraph/installation). The graph notebook can connect only if the `--bolt-server-name-for-init` setting is changed, as in the command above, where it is set to `Neo4j/`. For more information on changing configuration settings, refer to the Memgraph [how-to guide](https://memgraph.com/docs/memgraph/how-to-guides/config-logs).


After local setup of Memgraph is complete, set the following configuration to connect from graph-notebook:

```
%%graph_notebook_config
{
"host": "localhost",
"port": 7687,
"ssl": false
}
```

If you set up authentication on your Memgraph instance, you can provide login details via the configuration. For example, if you created the user `username` identified by `password`, use the following configuration:

    %%graph_notebook_config
    {
      "host": "localhost",
      "port": 7687,
      "ssl": false,
      "memgraph": {
        "username": "username",
        "password": "password",
        "auth": true
      }
    }

To learn how to manage users in Memgraph, refer to the [Memgraph documentation](https://memgraph.com/docs/memgraph/reference-guide/users).
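The two configurations above differ only in the optional `memgraph` section. As a minimal sketch, the body of the `%%graph_notebook_config` cell could be generated from Python as shown below. The helper name `build_memgraph_config` is hypothetical and not part of graph-notebook. The sketch assumes that omitting the `memgraph` section leaves the defaults from this PR in effect (empty credentials, `auth` disabled).

```python
import json


def build_memgraph_config(host="localhost", port=7687, ssl=False,
                          username="", password=""):
    """Return a dict shaped like the %%graph_notebook_config bodies above.

    Hypothetical helper for illustration; not part of graph-notebook.
    """
    config = {"host": host, "port": port, "ssl": ssl}
    # Add the "memgraph" section only when credentials are supplied.
    # Without it, the sketch assumes the defaults apply (auth disabled).
    if username or password:
        config["memgraph"] = {
            "username": username,
            "password": password,
            "auth": True,
        }
    return config


config = build_memgraph_config(username="username", password="password")
# Paste the printed JSON below the %%graph_notebook_config line in a cell.
print(json.dumps(config, indent=2))
```

Calling `build_memgraph_config()` with no arguments yields the unauthenticated configuration shown earlier.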

You can query Memgraph via the Bolt protocol, which was designed for efficient communication with graph databases. Memgraph supports versions 1 and 4 of the protocol. Ensure that you specify the `%%oc bolt` option when submitting queries to the Bolt endpoint. For example, to run a Cypher query over the Bolt protocol:

```
%%oc bolt
MATCH (n)
RETURN count(n)
```

To confirm that Memgraph is running, you can also open `localhost:3000` and check Memgraph Lab, a visual user interface. There you can see the node and relationship counts and explore, query, and visualize data. If you get stuck or have more questions, [let's talk at Memgraph Discord community](https://www.discord.gg/memgraph).
52 changes: 48 additions & 4 deletions src/graph_notebook/configuration/generate_config.py
@@ -10,6 +10,7 @@

from graph_notebook.neptune.client import SPARQL_ACTION, DEFAULT_PORT, DEFAULT_REGION, DEFAULT_GREMLIN_SERIALIZER, \
DEFAULT_GREMLIN_TRAVERSAL_SOURCE, DEFAULT_NEO4J_USERNAME, DEFAULT_NEO4J_PASSWORD, DEFAULT_NEO4J_DATABASE, \
DEFAULT_MEMGRAPH_USERNAME, DEFAULT_MEMGRAPH_PASSWORD, DEFAULT_MEMGRAPH_DATABASE, \
NEPTUNE_CONFIG_HOST_IDENTIFIERS, is_allowed_neptune_host, false_str_variants, \
GRAPHSONV3_VARIANTS, GRAPHSONV2_VARIANTS, GRAPHBINARYV1_VARIANTS

@@ -115,13 +116,42 @@ def to_dict(self):
return self.__dict__


class MemgraphSection(object):
    """
    Used for Memgraph-specific settings in a notebook's configuration
    """

    def __init__(self, username: str = "", password: str = "", auth: bool = False, database: str = ""):
        """
        :param username: login user for the Memgraph endpoint
        :param password: login password for the Memgraph endpoint
        :param auth: authentication switch for the Memgraph endpoint
        :param database: database used at Memgraph endpoint
        """

        if username == "":
            username = DEFAULT_MEMGRAPH_USERNAME
        if password == "":
            password = DEFAULT_MEMGRAPH_PASSWORD
        if database == "":
            database = DEFAULT_MEMGRAPH_DATABASE

        self.username = username
        self.password = password
        self.auth = True if auth in [True, "True", "true", "TRUE"] else False
        self.database = database

    def to_dict(self):
        return self.__dict__

class Configuration(object):
def __init__(self, host: str, port: int,
auth_mode: AuthModeEnum = DEFAULT_AUTH_MODE,
load_from_s3_arn='', ssl: bool = True, ssl_verify: bool = True, aws_region: str = DEFAULT_REGION,
proxy_host: str = '', proxy_port: int = DEFAULT_PORT,
sparql_section: SparqlSection = None, gremlin_section: GremlinSection = None,
neo4j_section: Neo4JSection = None,
memgraph_section: MemgraphSection = None,
neptune_hosts: list = NEPTUNE_CONFIG_HOST_IDENTIFIERS):
self._host = host.strip()
self.port = port
@@ -140,10 +170,12 @@ def __init__(self, host: str, port: int,
self.aws_region = aws_region
self.gremlin = GremlinSection()
self.neo4j = Neo4JSection()
self.memgraph = MemgraphSection()
else:
self.is_neptune_config = False
self.gremlin = gremlin_section if gremlin_section is not None else GremlinSection()
self.neo4j = neo4j_section if neo4j_section is not None else Neo4JSection()
self.memgraph = memgraph_section if memgraph_section is not None else MemgraphSection()

@property
def host(self):
@@ -175,7 +207,8 @@ def to_dict(self) -> dict:
'aws_region': self.aws_region,
'sparql': self.sparql.to_dict(),
'gremlin': self.gremlin.to_dict(),
'neo4j': self.neo4j.to_dict()
'neo4j': self.neo4j.to_dict(),
'memgraph': self.memgraph.to_dict()
}
else:
return {
@@ -187,7 +220,8 @@ def to_dict(self) -> dict:
'ssl_verify': self.ssl_verify,
'sparql': self.sparql.to_dict(),
'gremlin': self.gremlin.to_dict(),
'neo4j': self.neo4j.to_dict()
'neo4j': self.neo4j.to_dict(),
'memgraph': self.memgraph.to_dict()
}

def write_to_file(self, file_path=DEFAULT_CONFIG_LOCATION):
@@ -202,11 +236,11 @@ def generate_config(host, port, auth_mode: AuthModeEnum = AuthModeEnum.DEFAULT,
ssl_verify: bool = True, load_from_s3_arn='',
aws_region: str = DEFAULT_REGION, proxy_host: str = '', proxy_port: int = DEFAULT_PORT,
sparql_section: SparqlSection = SparqlSection(), gremlin_section: GremlinSection = GremlinSection(),
neo4j_section=Neo4JSection(), neptune_hosts: list = NEPTUNE_CONFIG_HOST_IDENTIFIERS):
neo4j_section=Neo4JSection(), memgraph_section=MemgraphSection(), neptune_hosts: list = NEPTUNE_CONFIG_HOST_IDENTIFIERS):
use_ssl = False if ssl in false_str_variants else True
verify_ssl = False if ssl_verify in false_str_variants else True
c = Configuration(host, port, auth_mode, load_from_s3_arn, use_ssl, verify_ssl, aws_region, proxy_host, proxy_port,
sparql_section, gremlin_section, neo4j_section, neptune_hosts)
sparql_section, gremlin_section, neo4j_section, memgraph_section, neptune_hosts)
return c


@@ -256,6 +290,14 @@ def generate_default_config():
default=True)
parser.add_argument("--neo4j_database", help="the name of the database to use for Neo4J",
default=DEFAULT_NEO4J_DATABASE)
parser.add_argument("--memgraph_username", help="the username to use for Memgraph connections",
default=DEFAULT_MEMGRAPH_USERNAME)
parser.add_argument("--memgraph_password", help="the password to use for Memgraph connections",
default=DEFAULT_MEMGRAPH_PASSWORD)
parser.add_argument("--memgraph_auth", help="whether to use auth for Memgraph connections or not [True|False]",
default=False)
parser.add_argument("--memgraph_database", help="the name of the database to use for Memgraph",
default=DEFAULT_MEMGRAPH_DATABASE)
args = parser.parse_args()

auth_mode_arg = args.auth_mode if args.auth_mode != '' else AuthModeEnum.DEFAULT.value
@@ -266,6 +308,8 @@ def generate_default_config():
args.gremlin_password, args.gremlin_serializer),
Neo4JSection(args.neo4j_username, args.neo4j_password,
args.neo4j_auth, args.neo4j_database),
MemgraphSection(args.memgraph_username, args.memgraph_password,
args.memgraph_auth, args.memgraph_database),
args.neptune_hosts)
config.write_to_file(args.config_destination)
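
Values passed on the command line reach `MemgraphSection` as strings, which is why its `auth` check compares against an explicit list of variants instead of relying on truthiness. The sketch below illustrates this with a hypothetical one-flag parser. `normalize_auth` and `TRUE_VARIANTS` are illustrative names, and the parser default is chosen for the example only.

```python
import argparse

# Values treated as "auth enabled", matching the list used in MemgraphSection.
TRUE_VARIANTS = [True, "True", "true", "TRUE"]


def normalize_auth(value) -> bool:
    """Return True only for the accepted truthy variants."""
    return value in TRUE_VARIANTS


# Hypothetical stand-in parser with a single flag (illustration only).
parser = argparse.ArgumentParser()
parser.add_argument("--memgraph_auth", default=False)

# No flag given: argparse hands back the default object unchanged.
from_default = parser.parse_args([]).memgraph_auth
# Flag given: argparse hands back the raw string "False".
from_cli = parser.parse_args(["--memgraph_auth", "False"]).memgraph_auth

print(normalize_auth(from_default))  # False
print(normalize_auth(from_cli))      # False
print(bool(from_cli))                # True: any non-empty string is truthy
```

The last line shows the pitfall the explicit list avoids: `bool("False")` is `True`, so a plain truthiness test would enable auth for `--memgraph_auth False`.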

19 changes: 15 additions & 4 deletions src/graph_notebook/configuration/get_config.py
@@ -6,9 +6,10 @@
import json

from graph_notebook.configuration.generate_config import DEFAULT_CONFIG_LOCATION, Configuration, AuthModeEnum, \
SparqlSection, GremlinSection, Neo4JSection
SparqlSection, GremlinSection, Neo4JSection, MemgraphSection
from graph_notebook.neptune.client import NEPTUNE_CONFIG_HOST_IDENTIFIERS, is_allowed_neptune_host, false_str_variants, \
DEFAULT_NEO4J_USERNAME, DEFAULT_NEO4J_PASSWORD, DEFAULT_NEO4J_DATABASE
DEFAULT_NEO4J_USERNAME, DEFAULT_NEO4J_PASSWORD, DEFAULT_NEO4J_DATABASE, DEFAULT_MEMGRAPH_USERNAME, DEFAULT_MEMGRAPH_PASSWORD, \
DEFAULT_MEMGRAPH_DATABASE

neptune_params = ['auth_mode', 'load_from_s3_arn', 'aws_region']

@@ -21,6 +22,7 @@ def get_config_from_dict(data: dict, neptune_hosts: list = NEPTUNE_CONFIG_HOST_I
sparql_section = SparqlSection(**data['sparql']) if 'sparql' in data else SparqlSection('')
gremlin_section = GremlinSection(**data['gremlin']) if 'gremlin' in data else GremlinSection()
neo4j_section = Neo4JSection(**data['neo4j']) if 'neo4j' in data else Neo4JSection('', '', True, '')
memgraph_section = (MemgraphSection(**data["memgraph"]) if "memgraph" in data else MemgraphSection("", "", False, ""))
proxy_host = str(data['proxy_host']) if 'proxy_host' in data else ''
proxy_port = int(data['proxy_port']) if 'proxy_port' in data else 8182

@@ -34,10 +36,19 @@ def get_config_from_dict(data: dict, neptune_hosts: list = NEPTUNE_CONFIG_HOST_I
print('Ignoring Neo4J custom authentication, Amazon Neptune does not support this functionality.\n')
if neo4j_section.to_dict()['database'] != DEFAULT_NEO4J_DATABASE:
print('Ignoring Neo4J custom database, Amazon Neptune does not support multiple databases.\n')
if memgraph_section.to_dict()["username"] != DEFAULT_MEMGRAPH_USERNAME \
or memgraph_section.to_dict()["password"] != DEFAULT_MEMGRAPH_PASSWORD:
print(
"Ignoring Memgraph custom authentication, Amazon Neptune does not support this functionality.\n"
)
if memgraph_section.to_dict()["database"] != DEFAULT_MEMGRAPH_DATABASE:
print(
"Ignoring Memgraph custom database, Amazon Neptune does not support multiple databases.\n"
)
config = Configuration(host=data['host'], port=data['port'], auth_mode=AuthModeEnum(data['auth_mode']),
ssl=data['ssl'], ssl_verify=ssl_verify, load_from_s3_arn=data['load_from_s3_arn'],
aws_region=data['aws_region'], sparql_section=sparql_section,
gremlin_section=gremlin_section, neo4j_section=neo4j_section,
gremlin_section=gremlin_section, neo4j_section=neo4j_section, memgraph_section=memgraph_section,
proxy_host=proxy_host, proxy_port=proxy_port, neptune_hosts=neptune_hosts)
else:
excluded_params = []
@@ -50,7 +61,7 @@ def get_config_from_dict(data: dict, neptune_hosts: list = NEPTUNE_CONFIG_HOST_I

config = Configuration(host=data['host'], port=data['port'], ssl=data['ssl'], ssl_verify=ssl_verify,
sparql_section=sparql_section, gremlin_section=gremlin_section, neo4j_section=neo4j_section,
proxy_host=proxy_host, proxy_port=proxy_port)
memgraph_section=memgraph_section, proxy_host=proxy_host, proxy_port=proxy_port)
return config
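
The `memgraph` key is optional in the configuration dict: when present, its entries are unpacked as keyword arguments, and when absent, defaults are used. A self-contained sketch of that pattern follows. `MemgraphSectionSketch` and `memgraph_section_from` are hypothetical stand-ins, not the classes from this PR. The default database name is the one defined in `client.py` in this diff.

```python
from dataclasses import dataclass, asdict

DEFAULT_MEMGRAPH_DATABASE = "memgraph"  # value defined in client.py in this diff


@dataclass
class MemgraphSectionSketch:
    """Hypothetical stand-in for MemgraphSection, for illustration only."""
    username: str = ""
    password: str = ""
    auth: bool = False
    database: str = DEFAULT_MEMGRAPH_DATABASE


def memgraph_section_from(data: dict) -> MemgraphSectionSketch:
    # ** forwards each key of the "memgraph" dict as a keyword argument,
    # so keys that are left out fall back to the defaults above.
    if "memgraph" in data:
        return MemgraphSectionSketch(**data["memgraph"])
    return MemgraphSectionSketch()


with_auth = memgraph_section_from(
    {"host": "localhost", "memgraph": {"username": "u", "password": "p", "auth": True}}
)
without_section = memgraph_section_from({"host": "localhost"})

print(asdict(with_auth))
print(asdict(without_section))
```

One consequence of keyword unpacking is that a misspelled key inside the `memgraph` section raises a `TypeError` instead of being silently ignored.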


4 changes: 3 additions & 1 deletion src/graph_notebook/magics/graph_magic.py
@@ -320,7 +320,9 @@ def _generate_client_from_config(self, config: Configuration):
.with_gremlin_login(config.gremlin.username, config.gremlin.password) \
.with_gremlin_serializer(config.gremlin.message_serializer) \
.with_neo4j_login(config.neo4j.username, config.neo4j.password, config.neo4j.auth,
config.neo4j.database)
config.neo4j.database) \
.with_memgraph_login(config.memgraph.username, config.memgraph.password, config.memgraph.auth,
config.memgraph.database)

self.client = builder.build()

23 changes: 21 additions & 2 deletions src/graph_notebook/neptune/client.py
@@ -37,6 +37,9 @@
DEFAULT_NEO4J_USERNAME = 'neo4j'
DEFAULT_NEO4J_PASSWORD = 'password'
DEFAULT_NEO4J_DATABASE = DEFAULT_DATABASE
DEFAULT_MEMGRAPH_USERNAME = ""
DEFAULT_MEMGRAPH_PASSWORD = ""
DEFAULT_MEMGRAPH_DATABASE = "memgraph"

NEPTUNE_SERVICE_NAME = 'neptune-db'
logger = logging.getLogger('client')
@@ -143,6 +146,8 @@ def __init__(self, host: str, port: int = DEFAULT_PORT, ssl: bool = True, ssl_ve
gremlin_serializer: str = DEFAULT_GREMLIN_SERIALIZER,
neo4j_username: str = DEFAULT_NEO4J_USERNAME, neo4j_password: str = DEFAULT_NEO4J_PASSWORD,
neo4j_auth: bool = True, neo4j_database: str = DEFAULT_NEO4J_DATABASE,
memgraph_username: str = DEFAULT_MEMGRAPH_USERNAME, memgraph_password: str = DEFAULT_MEMGRAPH_PASSWORD,
memgraph_auth: bool = False, memgraph_database: str = DEFAULT_MEMGRAPH_DATABASE,
auth=None, session: Session = None,
proxy_host: str = '', proxy_port: int = DEFAULT_PORT,
neptune_hosts: list = None):
@@ -161,6 +166,10 @@ def __init__(self, host: str, port: int = DEFAULT_PORT, ssl: bool = True, ssl_ve
self.neo4j_password = neo4j_password
self.neo4j_auth = neo4j_auth
self.neo4j_database = neo4j_database
self.memgraph_username = memgraph_username
self.memgraph_password = memgraph_password
self.memgraph_auth = memgraph_auth
self.memgraph_database = memgraph_database
self.region = region
self._auth = auth
self._session = session
@@ -370,6 +379,7 @@ def opencypher_http(self, query: str, headers: dict = None, explain: str = None,
res = self._http_session.send(req, verify=self.ssl_verify)
return res

# TODO Check this for Memgraph + typo on Cypher
def opencyper_bolt(self, query: str, **kwargs):
driver = self.get_opencypher_driver()
with driver.session(database=self.neo4j_database) as session:
@@ -417,11 +427,13 @@ def get_opencypher_driver(self):
password = DEFAULT_NEO4J_PASSWORD
auth_final = (user, password)
else:
if self.neo4j_auth:
# user changed default Memgraph auth to True
if self.memgraph_auth:
auth_final = (self.memgraph_username, self.memgraph_password)
elif self.neo4j_auth:
auth_final = (self.neo4j_username, self.neo4j_password)
else:
auth_final = None

driver = GraphDatabase.driver(url, auth=auth_final, encrypted=self.ssl)
return driver
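
The credential selection for non-Neptune hosts in this hunk can be restated as a small pure function: Memgraph credentials take precedence when Memgraph auth is enabled, then Neo4J credentials, otherwise no auth. This is an illustrative sketch, not code from the PR. `pick_bolt_auth` and its parameter names are hypothetical.

```python
from typing import Optional, Tuple

Login = Tuple[str, str]


def pick_bolt_auth(memgraph_auth: bool, memgraph_login: Login,
                   neo4j_auth: bool, neo4j_login: Login) -> Optional[Login]:
    """Mirror the precedence of the if/elif/else chain in the hunk above."""
    if memgraph_auth:
        # The user switched Memgraph auth on: use the Memgraph credentials.
        return memgraph_login
    if neo4j_auth:
        # Otherwise fall back to the Neo4J credentials when Neo4J auth is on.
        return neo4j_login
    # Neither switch is on: connect without credentials.
    return None


print(pick_bolt_auth(True, ("username", "password"), True, ("neo4j", "password")))
print(pick_bolt_auth(False, ("", ""), True, ("neo4j", "password")))
print(pick_bolt_auth(False, ("", ""), False, ("neo4j", "password")))
```

Note what the second call implies: if Neo4J auth is left enabled while Memgraph auth is off, the Neo4J credentials are what the driver receives.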

@@ -865,6 +877,13 @@ def with_neo4j_login(self, username: str, password: str, auth: bool, database: s
self.args['neo4j_database'] = database
return ClientBuilder(self.args)

def with_memgraph_login(self, username: str, password: str, auth: bool, database: str):
self.args["memgraph_username"] = username
self.args["memgraph_password"] = password
self.args["memgraph_auth"] = auth
self.args["memgraph_database"] = database
return ClientBuilder(self.args)

def with_tls(self, tls: bool):
self.args['ssl'] = tls
return ClientBuilder(self.args)
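
`with_memgraph_login` follows the same chaining pattern as the other builder methods: record the arguments, then return a new builder that carries them. Below is a minimal, self-contained sketch of that pattern. `ClientSketch` and `ClientBuilderSketch` are hypothetical classes, and copying the argument dict in the constructor is an assumption of this sketch, not something the diff shows.

```python
class ClientSketch:
    """Hypothetical client that simply stores what the builder collected."""

    def __init__(self, host="localhost", **options):
        self.host = host
        self.options = options


class ClientBuilderSketch:
    def __init__(self, args=None):
        # Assumption of this sketch: copy the dict so builders do not share state.
        self.args = dict(args or {})

    def with_host(self, host):
        self.args["host"] = host
        return ClientBuilderSketch(self.args)

    def with_memgraph_login(self, username, password, auth, database):
        self.args["memgraph_username"] = username
        self.args["memgraph_password"] = password
        self.args["memgraph_auth"] = auth
        self.args["memgraph_database"] = database
        return ClientBuilderSketch(self.args)

    def build(self):
        # Every collected argument becomes a keyword argument of the client.
        return ClientSketch(**self.args)


client = (
    ClientBuilderSketch()
    .with_host("localhost")
    .with_memgraph_login("username", "password", True, "memgraph")
    .build()
)
print(client.host, client.options["memgraph_auth"])
```

Because each `with_*` call returns a builder, the calls can be chained in any order before `build()`.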