Interacting with the knowledge graph

The knowledge graph plays a key role in motleycrew. It stores the state used to dispatch workers, plus any other state you wish to store and query as part of your application.

We currently use Kùzu as the knowledge graph backend because it is embeddable, supports openCypher, is available under the MIT license, and has a LlamaIndex integration. Please let us know if you would like to use another backend.

To make interaction with Kùzu from Python more natural, we have written a thin OGM (object-graph mapping) layer on top of it. You can still run arbitrary Cypher queries against Kùzu if the layer’s abstractions don’t fit your purpose.

First, let’s create a database and a graph store.

[1]:
import kuzu
from motleycrew.storage import MotleyKuzuGraphStore

database = kuzu.Database("example_db")
graph_store = MotleyKuzuGraphStore(database)

If you are using motleycrew, chances are you have a MotleyCrew instance, which already has a graph store. It can be accessed via the graph_store attribute.
Tasks registered with a crew also have a graph_store attribute, which points to the crew’s graph store.

Graph nodes are represented as Pydantic classes inheriting from MotleyGraphNode.

[2]:
from typing import Optional
from motleycrew.storage import MotleyGraphNode

class Person(MotleyGraphNode):
    name: str
    age: int
    occupation: Optional[str] = None

Let’s insert two nodes into the graph.

[3]:
john = Person(name="John", age=25, occupation="Data Scientist")
jane = Person(name="Jane", age=30, occupation="Software Engineer")

graph_store.insert_node(john)
graph_store.insert_node(jane)

john.is_inserted
[3]:
True

When a node is created and inserted into the graph, it becomes tied to the graph store. This means that any changes made to the node object will be reflected in the graph store.

[4]:
john.age += 1  # this change is instantly saved to the database
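To make the idea of a node being "tied" to its store more concrete, here is a minimal sketch of a write-through attribute that uses only the standard library. It is not how motleycrew implements this; FakeStore and TrackedNode are hypothetical names invented for illustration, and the real OGM layer may work quite differently.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for a graph store (illustration only)."""

    def __init__(self):
        self.rows = {}    # node id -> dict of properties
        self.next_id = 0

    def insert(self, node):
        node_id = self.next_id
        self.next_id += 1
        # Copy the node's public properties into the store.
        self.rows[node_id] = {k: v for k, v in vars(node).items() if not k.startswith("_")}
        # Set bookkeeping fields without triggering the write-through hook.
        object.__setattr__(node, "_id", node_id)
        object.__setattr__(node, "_store", self)

    def update(self, node_id, name, value):
        self.rows[node_id][name] = value


class TrackedNode:
    """Hypothetical node whose attribute writes are forwarded to its store."""

    def __init__(self, **properties):
        for name, value in properties.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        store = getattr(self, "_store", None)
        if store is not None:  # only inserted nodes write through
            store.update(self._id, name, value)


store = FakeStore()
john = TrackedNode(name="John", age=25)
store.insert(john)
john.age += 1  # forwarded to the store by __setattr__
print(store.rows[0]["age"])
```

The point of the sketch is only that an ordinary-looking assignment can trigger a store update once the object knows which store it belongs to.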

Now let’s create a relation indicating Jane is John’s manager.

[5]:
graph_store.create_relation(from_node=jane, to_node=john, label="manages")

The biggest advantage of using the graph store is the ability to query the data flexibly using the Cypher query language.
You can find a great manual on Cypher in the Kùzu docs: https://docs.kuzudb.com/cypher/.

Let’s find all people that are managed by Jane.

[6]:
graph_store.run_cypher_query("MATCH (m:Person)-[:manages]->(p:Person) WHERE m.name = 'Jane' RETURN p")
[6]:
[[{'_id': {'offset': 1, 'table': 2},
   '_label': 'Person',
   'id': 1,
   'name': 'John',
   'age': 26,
   'occupation': 'Data Scientist'}]]
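The raw result above is a list of rows, where each row is a list of returned values and each node is a dict. If you only need the user-defined properties, a small helper can drop the bookkeeping keys. This is a sketch under the assumption that keys starting with an underscore (such as _id and _label in the output above) are internal; node_properties is a hypothetical helper, not part of motleycrew.

```python
def node_properties(rows):
    """Flatten query rows into plain property dicts, dropping underscore-prefixed keys."""
    cleaned = []
    for row in rows:
        for value in row:
            if isinstance(value, dict):  # skip scalar columns, keep node dicts
                cleaned.append({k: v for k, v in value.items() if not k.startswith("_")})
    return cleaned


# Sample data copied from the query output shown above.
raw = [[{"_id": {"offset": 1, "table": 2},
         "_label": "Person",
         "id": 1,
         "name": "John",
         "age": 26,
         "occupation": "Data Scientist"}]]

print(node_properties(raw))
```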

Often you would like to get the query results as objects. You can do this by providing a container argument to run_cypher_query.

[7]:
graph_store.run_cypher_query("MATCH (m:Person)-[:manages]->(p:Person) WHERE m.name = 'Jane' RETURN p", container=Person)
[7]:
[Person(name='John', age=26, occupation='Data Scientist')]

In Cypher, the kind of an object is represented by a label. Labels appear in the query after the colons, as in (m:Person). By default, motleycrew uses the class name as the label. You can override this behavior by setting the __label__ attribute in the node class.

You can get the label by calling the get_label method on either the node class or an instance of it. The inserted nodes also have an id attribute that identifies them among the nodes of the same label.
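The label rule described above (class name by default, __label__ as an override) can be mimicked in a few lines of plain Python. This is only a stand-in to show the behavior: NodeBase is a hypothetical class, not MotleyGraphNode, and the library's actual implementation may differ.

```python
class NodeBase:
    """Hypothetical stand-in that mimics the label rule described above."""

    __label__ = None  # subclasses may set this to override the default

    @classmethod
    def get_label(cls):
        # Fall back to the class name when no explicit label is set.
        return cls.__label__ or cls.__name__


class Person(NodeBase):
    pass


class Employee(NodeBase):
    __label__ = "Staff"  # explicit override


print(Person.get_label())      # class-level call
print(Employee().get_label())  # instance-level call works too
```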

So, in a real application, our query would probably look like this:

[8]:
label = Person.get_label()  # john.get_label() would also work
query = f"MATCH (m:{label})-[:manages]->(p:{label}) WHERE m.id = $manager_id RETURN p"

graph_store.run_cypher_query(query, parameters={"manager_id": jane.id}, container=Person)
[8]:
[Person(name='John', age=26, occupation='Data Scientist')]
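Note that the query above passes the manager id as a parameter but interpolates the label with an f-string, since Cypher parameters generally stand in for values rather than labels. If a label could ever come from outside your code, it may be worth validating it before interpolation. The helper below is a sketch; managed_by_query and the identifier pattern are assumptions made for illustration, not part of motleycrew.

```python
import re

# Assumed rule: a label is a plain identifier (letters, digits, underscores).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def managed_by_query(label: str) -> str:
    """Build the 'managed by' query, refusing labels that are not plain identifiers."""
    if not _IDENTIFIER.fullmatch(label):
        raise ValueError(f"unsafe label: {label!r}")
    return f"MATCH (m:{label})-[:manages]->(p:{label}) WHERE m.id = $manager_id RETURN p"


print(managed_by_query("Person"))
```

The resulting string can then be passed to run_cypher_query together with the parameters dict, exactly as in the cell above.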

The graph store has an upsert_triplet method that is useful for creating nodes and relations in one go.

Let’s create a new subordinate for Jane.

[9]:
joe = Person(name="Joe", age=35, occupation="Software Engineer")
graph_store.upsert_triplet(from_node=jane, to_node=joe, label="manages")

graph_store.run_cypher_query(query, parameters={"manager_id": jane.id}, container=Person)
[9]:
[Person(name='John', age=26, occupation='Data Scientist'),
 Person(name='Joe', age=35, occupation='Software Engineer')]

The check_node_exists and check_relation_exists methods can be used to check whether a node or a relation exists in the graph store.

[10]:
print(graph_store.check_node_exists(john))
print(graph_store.check_relation_exists(jane, john))
print(graph_store.check_relation_exists(jane, john, "manages"))
print(graph_store.check_relation_exists(john, jane))
True
True
True
False
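The output above suggests that relations are directed and that the label argument is an optional filter. A tiny in-memory model of that behavior, using only the standard library, might look like the sketch below; relation_exists and the triple set are hypothetical and say nothing about how the graph store performs the check internally.

```python
# Each relation is stored as a directed (source, label, target) triple.
relations = {("jane", "manages", "john")}


def relation_exists(source, target, label=None):
    """Return True if a directed relation source -> target exists, optionally with a given label."""
    return any(
        s == source and t == target and (label is None or l == label)
        for s, l, t in relations
    )


print(relation_exists("jane", "john"))             # any label, jane -> john
print(relation_exists("jane", "john", "manages"))  # specific label
print(relation_exists("john", "jane"))             # reversed direction
```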

You can also use the get_node_by_class_and_id method for retrieving nodes.

[11]:
graph_store.get_node_by_class_and_id(Person, john.id)
[11]:
Person(name='John', age=26, occupation='Data Scientist')

Finally, let’s delete a node. This will also delete all relations that the node is part of.

[12]:
graph_store.delete_node(jane)
graph_store.check_node_exists(jane)
[12]:
False

Kùzu provides a convenient frontend called Kùzu Explorer that is useful for debugging and exploring the graph store.

The simplest way to run it is using the following command:

docker run -p 8000:8000 -v /absolute/path/to/the/db:/database --rm kuzudb/explorer:latest

You can then access it at http://localhost:8000.

To display all nodes and relations, you can run the following query:

MATCH (A)-[r]->(B) RETURN *;

[13]:
print(f"docker run -p 8000:8000  -v {graph_store.database_path}:/database --rm kuzudb/explorer:latest")
docker run -p 8000:8000  -v /Users/whimo/motleycrew/examples/example_db:/database --rm kuzudb/explorer:latest
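If your database path may be relative or contain spaces, you could build the command with the standard library instead of a bare f-string. The sketch below resolves the path to an absolute one, as the command above expects, and quotes it for the shell. explorer_command is a hypothetical helper, and the port parameter only changes the host side of the mapping.

```python
import shlex
from pathlib import Path


def explorer_command(db_path, port=8000):
    """Return a shell-safe docker command that mounts db_path into Kùzu Explorer."""
    absolute = Path(db_path).resolve()  # the bind mount needs an absolute host path
    return shlex.join([
        "docker", "run",
        "-p", f"{port}:8000",           # host port -> container port 8000
        "-v", f"{absolute}:/database",
        "--rm", "kuzudb/explorer:latest",
    ])


print(explorer_command("example_db"))
```

In the notebook you would pass graph_store.database_path instead of the literal "example_db".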