Graphs - Knowledge Graphs, Networks, and Databases
A graph (or network) is a set of nodes connected by edges. The edges can be directed or undirected. The nodes can be anything, such as people, web pages, computers, or airports. The edges can represent any relationship between nodes, such as friendship between people, hyperlinks between web pages, or flights between airports.
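To make the definition concrete, here is a minimal sketch of one common way to hold a graph in memory: an adjacency mapping from each node to its neighbors. The helper name `build_adjacency` and the airport codes are illustrative assumptions, not part of any library.

```python
from collections import defaultdict


def build_adjacency(edges, directed=False):
    """Map each node to the set of nodes it has an edge to."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target]  # touch the key so nodes with no outgoing edges still appear
        if not directed:
            # An undirected edge is stored in both directions.
            adjacency[target].add(source)
    return dict(adjacency)


# Hypothetical flight routes between airports.
flights = [("JFK", "LHR"), ("LHR", "CDG"), ("JFK", "SFO")]

directed_graph = build_adjacency(flights, directed=True)
undirected_graph = build_adjacency(flights, directed=False)
```

In the directed version, `LHR` only points to `CDG`; in the undirected version it is also connected back to `JFK`.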
Graphs
There is a ton to cover.
Some topics I'll cover:
Graph theory from a mathematical perspective
Structuring data as a graph
Algorithms for working with graphs (e.g. traversals, shortest paths, etc.)
Graph neural networks, transformers, and other ML models for graphs. This could mean using a graph as part of the architecture or having the model interact with a graph.
Graph databases and other tools for working with graphs
Graph visualization (lots of fun pretty pictures and animations)
Using AI to augment graph data (e.g. link prediction, node classification, etc.)
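As a small taste of the traversal and shortest-path topic listed above, here is a sketch of breadth-first search on an unweighted graph stored as an adjacency mapping. The function name `shortest_path` and the tiny `web` graph are made up for illustration; weighted graphs would need a different algorithm such as Dijkstra's.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical hyperlink graph: page -> pages it links to.
web = {
    "home": ["about", "blog"],
    "about": ["team"],
    "blog": ["post"],
    "post": ["team"],
    "team": [],
}

route = shortest_path(web, "home", "team")
```

Because breadth-first search explores nodes in order of distance from the start, the first time it reaches the goal it has found a path with the fewest edges.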
Public Knowledge Graphs
wikidata - a free and open knowledge base that can be read and edited by both humans and machines.
dbpedia - a crowd-sourced community effort to extract structured content from the information created in various Wikimedia projects.
PyKEEN (Python KnowlEdge EmbeddiNgs) - a Python package designed to train and evaluate knowledge graph embedding models (incorporating multi-modal information).
enumo / ruler is a domain-specific language for programmable theory exploration. It uses equality saturation to infer small, expressive rulesets for a domain.
Sonic - a fast, lightweight and schema-less search backend. It ingests search texts and identifier tuples that can then be queried in microseconds.
Sonic can be used as a simple alternative to super-heavy and full-featured search backends such as Elasticsearch in some use-cases. It is capable of normalizing natural language search queries, auto-completing a search query and providing the most relevant results for a query. Sonic is an identifier index, rather than a document index; when queried, it returns IDs that can then be used to refer to the matched documents in an external database.
A strong attention to performance and code cleanliness has been given when designing Sonic. It aims at being crash-free, super-fast and puts minimum strain on server resources (our measurements have shown that Sonic - when under load - responds to search queries in the μs range, eats ~30MB RAM and has a low CPU footprint; see benchmarks).
Sonic is integrated in all Crisp search products on the Crisp platform. It is used to index half a billion objects on a $5/mth 1-vCPU SSD cloud server (as of 2019). Crisp users use it to search in their messages, conversations, contacts, helpdesk articles and more.
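The "identifier index rather than document index" idea can be sketched in a few lines. This is a toy in-memory illustration only: the class `ToyIdentifierIndex`, its `push` and `query` methods, and the sample messages are assumptions made for this example and are not Sonic's actual API or protocol.

```python
from collections import defaultdict


class ToyIdentifierIndex:
    """Maps words to object IDs; the documents themselves live elsewhere."""

    def __init__(self):
        self._postings = defaultdict(set)

    def push(self, object_id, text):
        # Index each lowercased word of the text under the object's ID.
        for word in text.lower().split():
            self._postings[word].add(object_id)

    def query(self, terms):
        # Return the IDs of objects containing every query word.
        words = terms.lower().split()
        if not words:
            return set()
        matches = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            matches &= self._postings.get(word, set())
        return matches


# The "external database" holding the full documents (hypothetical data).
messages = {
    "msg:1": "Reset your password",
    "msg:2": "Password policy update",
}

index = ToyIdentifierIndex()
for message_id, body in messages.items():
    index.push(message_id, body)

hit_ids = index.query("reset password")
hits = [messages[message_id] for message_id in sorted(hit_ids)]
```

The index only ever returns IDs such as `msg:1`; turning those into full records is a second lookup against whatever database stores them.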
Neo4j - a graph database management system developed by Neo4j, Inc. Described by its developers as an ACID-compliant transactional database with native graph storage and processing. Neo4j is the most popular graph database according to DB-Engines ranking, and the 22nd most popular database overall.
KuzuDB - Embeddable property graph database management system built for query speed and scalability. Implements Cypher. C++ core with Python, Rust, NodeJS, Java, and CLI clients.
LangGraph is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain. It extends the LangChain Expression Language with the ability to coordinate multiple chains (or actors) across multiple steps of computation in a cyclic manner. It is inspired by Pregel and Apache Beam. The current interface exposed is one inspired by NetworkX.
The main use is for adding cycles to your LLM application. Crucially, this is NOT a DAG framework. If you want to build a DAG, you should just use LangChain Expression Language.
Cycles are important for agent-like behaviors, where you call an LLM in a loop, asking it what action to take next.
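The loop described above can be sketched in plain Python. This does not use LangGraph or LangChain; `fake_llm`, `add_tool`, and `run_agent` are hypothetical stand-ins meant only to show why the control flow is a cycle rather than a DAG.

```python
def fake_llm(state):
    """Hypothetical stand-in for an LLM call that picks the next action."""
    if state["total"] < state["target"]:
        return "add"
    return "finish"


def add_tool(state):
    """A trivial 'tool': increment the running total and count the step."""
    return {**state, "total": state["total"] + 1, "steps": state["steps"] + 1}


def run_agent(state, max_steps=10):
    """Ask the model for an action, run it, and feed the new state back in."""
    tools = {"add": add_tool}
    for _ in range(max_steps):  # cap iterations so the cycle always terminates
        action = fake_llm(state)
        if action == "finish":
            return state
        state = tools[action](state)
    return state


final_state = run_agent({"total": 0, "target": 3, "steps": 0})
```

The model node is revisited after every tool call, which is the cycle a DAG cannot express. The `max_steps` cap is a design choice assumed here to guard against a model that never chooses to finish.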