Blog 16 Jul 2025

Siren Embraces GQL to Unlock the Full Power of Knowledge Graphs

Author: Renaud Delbru

The evolution from search to intelligence 

Modern investigations operate across billions of documents, countless relationships and multiple data types. The tools we use shape how effectively we can navigate this complexity. For the past decade, Siren has built increasingly sophisticated knowledge graph capabilities, from distributed joins to path finding at scale. Yet we have always known there was a cognitive gap between our graph technology and the people who used it. With our adoption of GQL, we’re aligning the way users interact with our graph with how investigators naturally think about relationships. This doesn’t represent a change in direction; it is another step in our ongoing effort to make graph-based investigation more accessible to those who need it most, while establishing the structured foundation necessary for AI to augment human investigation tomorrow.

1. The transformation of investigation and why Knowledge Graphs matter

The world of investigations has been changing gradually for a long time, but with new geopolitical tensions the urgency has increased dramatically. In contemporary investigations, workloads are defined by their unpredictability, complexity and sheer scale, whether the threat is cyber, AI-generated, cognitive or hybrid. Whether you’re a law enforcement agency mapping the intricate networks of suspects, vehicles, communications and locations, a financial institution tracing the flow of funds across borders and entities, or a journalist connecting the dots across millions of leaked documents, the volume and complexity of data are unprecedented. Even before the era of AI-generated content, we were talking about billions of structured records, unstructured text, images, videos and network logs.

Knowledge graphs have emerged as a natural framework for this challenge. They provide a unifying abstraction, allowing organizations to represent and navigate these complex relationships across all these data types.

But there’s been a persistent challenge. The tools to query knowledge graphs weren’t designed around how investigators actually work. Real investigations are iterative and exploratory. Investigators start with fragments and progressively expand their understanding through exploration, or they scan globally for patterns then zoom in on what stands out. Each discovery reshapes the investigation’s direction.

This requires more than traditional search or database queries. It requires systems that can seamlessly blend three essential capabilities: (1) information retrieval across text and multimedia, (2) graph traversals for relationship discovery, and (3) analytical operations for pattern detection and statistical insights. Investigators need to express these complex operations as intuitively as they think about them, without disrupting their cognitive flow with context-switching between tools or response delays.

2. The journey from JSON to GQL

Siren’s journey began with a clear vision: to empower investigative teams with the ability to ask and answer complex questions over their data, no matter the scale. We built our foundation on proven search technology, starting with Apache Lucene’s powerful indexing and retrieval capabilities, then leveraging Elasticsearch’s modern distributed framework and REST API. By extending Elasticsearch’s JSON-based interface with our distributed join algorithms, we added the crucial missing piece: relational search capabilities that could connect data across indices. This approach felt natural. It was familiar to Elasticsearch users and enabled set-to-set navigation for investigative workflows. It has served us well, powering real deployments across hundreds of nodes and terabytes of data [1].

But as investigative patterns grew more complex, the limitations of this approach became apparent. What started as simple relationship queries evolved into sophisticated graph algorithms. JSON’s hierarchical structure increasingly clashed with graph patterns. Technical users working directly with our API often expressed frustration with its limitations. Representing complex patterns required massive, deeply nested queries with duplicated paths. Even our own engineers found operations like variable-length traversals challenging to implement and error-prone. Users who naturally think in relationships had to translate these into convoluted constructs, while non-technical investigators couldn’t engage with the API at all.

The mismatch was fundamental. We think in relationships, not in JSON hierarchies. Query languages should reflect how investigators naturally approach problems.

3. GQL as the natural choice in 2025

The emergence of GQL as an ISO standard in 2024 marked a turning point for the graph database community. GQL finally represents a consensus on how to express graph operations effectively. For Siren, with our decade-long focus on knowledge graphs, GQL was the natural next step. The ISO standardization provides stability and interoperability. But its adoption is about more than keeping pace with industry standards; it also offers something much more essential, a way to express investigations without the mismatch we’d been fighting. Its syntax naturally accommodates the hybrid queries that define real investigations, where finding suspicious transactions might require combining text search, geographic boundaries, and multi-hop graph traversals in a single, readable expression.

The practical impact is immediate. What once required 50+ lines of nested JSON can now be expressed in a single, readable GQL statement:

MATCH (:Suspect) -[:calls]->{1,3} (:Person) -[:owns]-> (:Account WHERE "suspicious: true")

This single line captures a complex investigation pattern: find suspicious accounts owned by people who are within three call-hops of a suspect. Investigators no longer need to translate between mental models. This extends across different query patterns:

Pattern Matching – Connecting entities through known relationships

 MATCH (:Person) -[:owns]-> (:Vehicle) -[:seen_at]-> (:Location)

An investigator tracking vehicle movements can directly express the connection chain from owner to location sightings.
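To make the chain pattern concrete, here is a minimal Python sketch of what such a match computes on a toy in-memory graph. It is an illustration only, not how Siren Federate evaluates GQL: the edge list, the node labels and the `match_chain` helper are all made up for this example.

```python
# Toy data (hypothetical): edges as (source, relationship, target) plus node labels.
EDGES = [
    ("alice", "owns", "car_1"),
    ("bob", "owns", "car_2"),
    ("car_1", "seen_at", "harbour"),
    ("car_1", "seen_at", "airport"),
    ("car_2", "seen_at", "harbour"),
]
LABELS = {
    "alice": "Person", "bob": "Person",
    "car_1": "Vehicle", "car_2": "Vehicle",
    "harbour": "Location", "airport": "Location",
}


def match_chain(edges, labels):
    """Return (person, vehicle, location) triples for Person-owns-Vehicle-seen_at-Location."""
    owns = [(s, t) for s, rel, t in edges
            if rel == "owns" and labels[s] == "Person" and labels[t] == "Vehicle"]
    seen = [(s, t) for s, rel, t in edges
            if rel == "seen_at" and labels[s] == "Vehicle" and labels[t] == "Location"]
    # Join the two edge sets on the shared vehicle node.
    return sorted((p, v, loc) for p, v in owns for v2, loc in seen if v == v2)


matches = match_chain(EDGES, LABELS)
for person, vehicle, location in matches:
    print(person, vehicle, location)
```

Each hop in the pattern becomes one join on the shared node, which is why a fixed-length chain maps naturally onto a join-based engine.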

Variable-Length Paths – Exploring communication networks of varying depth

 MATCH (:Suspect) -[:communicates]->{1,5} (:Person)

When investigating who a suspect might reach through their extended network, users can search across one to five hops of communication.
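The meaning of a bounded quantifier such as {1,5} can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Siren's engine: the `COMMS` edge list and the `reachable_within` helper are hypothetical, and label checks on the nodes are omitted for brevity.

```python
from collections import defaultdict

# Made-up communication edges (caller, callee); "s" stands in for the suspect.
COMMS = [("s", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]


def reachable_within(edges, start, min_hops, max_hops):
    """Return nodes reachable from start by a walk of min_hops..max_hops edges."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)

    frontier, found = {start}, set()
    for hop in range(1, max_hops + 1):
        # Expand one hop: every node reachable in exactly `hop` steps.
        frontier = {t for node in frontier for t in adjacency.get(node, ())}
        if not frontier:
            break
        if hop >= min_hops:
            found |= frontier
    return found


contacts = reachable_within(COMMS, "s", 1, 5)
print(sorted(contacts))
```

In this toy chain, node "f" sits six hops from the suspect and is therefore excluded, while everything from one to five hops away is returned.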

Shortest Path – Finding the most direct connection between entities

 MATCH ALL SHORTEST (account1) -[:transfers]->* (account2)

For financial investigations, this reveals the most direct money flow between two accounts.
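For intuition, the idea behind ALL SHORTEST can be sketched with a textbook breadth-first search that records every predecessor lying on a minimum-length route. This is a generic algorithm shown for illustration, not the technique Siren uses; the `TRANSFERS` data and the `all_shortest_paths` helper are assumptions made for the example.

```python
from collections import defaultdict, deque

# Made-up transfer edges (from_account, to_account).
TRANSFERS = [
    ("acc1", "x"), ("acc1", "y"), ("x", "acc2"), ("y", "acc2"),
    ("acc1", "z"), ("z", "w"), ("w", "acc2"),
]


def all_shortest_paths(edges, source, target):
    """Return every minimum-length directed path from source to target."""
    adjacency = defaultdict(list)
    for s, t in edges:
        adjacency[s].append(t)

    # Breadth-first search, remembering all predecessors on shortest routes.
    dist = {source: 0}
    preds = defaultdict(list)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                preds[nxt].append(node)

    if target not in dist:
        return []

    def build(node):
        # Rebuild paths by walking predecessor links back to the source.
        if node == source:
            return [[source]]
        return [path + [node] for p in preds[node] for path in build(p)]

    return sorted(build(target))


paths = all_shortest_paths(TRANSFERS, "acc1", "acc2")
for path in paths:
    print(" -> ".join(path))
```

Here the two-hop routes through "x" and "y" are both returned, while the longer three-hop route through "z" and "w" is dropped.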

4. Purpose-built graph query engine innovation

GQL is more than a new syntax. Under the hood, it’s powered by a purpose-built graph query engine, layered atop Siren Federate’s proven distributed join framework. This engine draws on a decade of experience in high-performance distributed systems and brings a host of technical innovations [2].

Iterative computation, such as our Semi-Join Decomposition technique for path queries [3], enables complex graph traversals at scale, processing expansions incrementally rather than materializing massive intermediate results. Adaptive query planning leverages runtime statistics, continuously refining execution strategies as data patterns emerge. Semantic caching eliminates redundant computation during exploration. Columnar processing and distributed execution maintain high performance and throughput even as relationship complexity grows.
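The difference between materializing intermediate results and expanding incrementally can be illustrated with a small Python experiment. This is a generic sketch of the idea, not the patented Semi-Join Decomposition technique, whose details are not described here; the `layered_graph`, `count_paths_materialized` and `frontier_sizes` helpers are hypothetical.

```python
from itertools import product


def layered_graph(width, depth):
    """Build layers where every node links to every node in the next layer."""
    edges = set()
    for layer in range(depth):
        for a, b in product(range(width), repeat=2):
            edges.add(((layer, a), (layer + 1, b)))
    return edges


def count_paths_materialized(edges, start, hops):
    """Naive approach: keep every partial path in memory at each hop."""
    paths = [(start,)]
    for _ in range(hops):
        paths = [p + (t,) for p in paths for s, t in edges if s == p[-1]]
    return len(paths)


def frontier_sizes(edges, start, hops):
    """Semi-join style: keep only the distinct nodes reached at each hop."""
    frontier = {start}
    sizes = []
    for _ in range(hops):
        # Filter edges by the current frontier, then project the targets.
        frontier = {t for s, t in edges if s in frontier}
        sizes.append(len(frontier))
    return sizes


EDGES = layered_graph(width=3, depth=4)
START = (0, 0)
print(count_paths_materialized(EDGES, START, 4))
print(frontier_sizes(EDGES, START, 4))
```

On this toy graph the number of materialized paths grows as the width raised to the number of hops, while the frontier never exceeds the width of a layer. That gap is the reason an incremental, set-based expansion stays tractable where path enumeration does not.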

Perhaps most importantly, the engine preserves full integration with Siren Federate’s search capabilities. Within a single query, you can combine graph traversals with full-text search, vector similarity and spatial analysis.

Performance matches investigative needs. Common operations execute responsively even at scale. Naturally, some patterns are computationally expensive, but our optimizations keep even demanding queries practical.

5. Transforming investigation workflows

The adoption of GQL affects different users in distinct ways. For analysts, one of the immediate benefits is cognitive alignment. Complex investigations can now be expressed naturally without mapping mental models to raw data structures and convoluted JSON transformations. Interacting with the system becomes not just more intuitive, but more powerful through the language’s enhanced expressiveness.

For developers, GQL provides a standard language designed for graph operations, extended with the capabilities of Siren’s search. One major impact is the reduction in development complexity: what required hundreds of lines, if not more, of custom logic can now be expressed in concise and more readable GQL statements. Code reviews become clearer when queries express intent directly. This accelerates development cycles while the language’s expressiveness unlocks new possibilities.

For organizations, adopting an ISO standard provides confidence in long-term support. Investments in training and tooling are protected from proprietary obsolescence. Interoperability improves as investigation results can flow seamlessly between systems. At Siren, investing in GQL means aligning investigative workflows with industry standards while establishing the foundation for AI-driven intelligence capabilities on the horizon.

6. The vision: intelligent reasoning on knowledge graphs

GQL is not just a feature. It is the bridge between human thinking and graph operations, and the foundation for a more ambitious transformation. Our research initiatives are exploring how foundation models can bridge the gap between natural language and graph operations, opening investigations to a broader audience.

The roadmap unfolds progressively, including natural language interfaces that translate investigative questions into GQL queries. For example: “Show me all suspicious transactions linked to this network.” Adaptive workflows will learn from investigative patterns, suggesting relevant queries and automatically adjusting strategies based on discoveries. AI agents will autonomously explore graph patterns, surfacing insights that human analysts might miss in the complexity.

Siren technology is uniquely positioned at the intersection of some truly essential capabilities. Large language models (LLMs) now understand natural language with unprecedented sophistication. Knowledge graphs provide the structured foundation necessary for precise reasoning. Our enterprise search heritage of Elasticsearch delivers advanced text, vector and spatial search alongside proven scalability. Siren Federate’s distributed graph analytics capabilities make complex operations practical at scale. GQL unites these elements into a coherent whole.

Long-term, we envision investigation as a conversation. Analysts describe their hypotheses and questions naturally. The system translates these into optimal query plans, executes across multiple paradigms, and returns synthesized insights. Crucially, the process remains transparent and auditable, essential for accountability.

The implications extend across domains. Frontline officers gain access to the same investigative power as specialized analysts. Journalists can navigate complex document networks without data science expertise. Financial compliance teams can scale their investigations without proportionally scaling their technical staff.

This is our commitment: to make the full power of knowledge graphs accessible to anyone who needs to uncover truth from data, by building intelligent bridges between human insight and computational power.

7. The impact and the Siren Secret Sauce

The adoption of GQL reveals what makes Siren unique in the investigative intelligence landscape. We haven’t built another graph database or search engine. Instead, we’ve created something that doesn’t quite fit existing categories: a system where information retrieval, relational analytics and graph operations genuinely coexist.

While many platforms excel at individual query types, investigations don’t respect these boundaries. Real investigations demand seamless movement between searching documents, tracing relationships, and analyzing patterns. We architected Siren specifically for this reality, optimized for the investigative workflow itself. This means accepting complexity. It means solving hard problems like distributed joins at scale and path enumeration without memory explosion. These aren’t academic exercises; they’re the foundations that make fluid investigation possible.

By unifying search capabilities with distributed joins and graph operations, we enable investigators to express complex multi-paradigm queries naturally. GQL amplifies these capabilities by providing a standard language that preserves our unique multi-paradigm approach while opening paths to future AI integration.

We’ve spent a decade exploring how computational systems can support human investigation. GQL is the next step in that journey. It empowers analysts to ask smarter questions, developers to build more powerful tools and organizations to evolve toward intelligence-driven operations.

Looking ahead, we see GQL as essential infrastructure for responsible AI integration. Investigations require explainable processes, from query to conclusion. As we develop natural language interfaces, GQL provides the structured intermediate layer that keeps operations transparent. When decisions impact lives and reputations, black-box solutions aren’t acceptable. We see it as our responsibility to ensure that technology amplifies human judgment rather than replacing it.

This is more than a technology milestone. It’s a continuation of our commitment to building systems that respect complexity, serve those on the frontlines of investigation, and stay grounded in one simple principle: building tools that help investigators find truth in data, transparently and reliably.


[1] JSON-based queries remain fully supported in Siren Federate. GQL is an additional interface, not a replacement.

[2] Bordea, G., Campinas, S., Catena, M., & Delbru, R. (2025). Siren Federate: Bridging document, relational, and graph models for exploratory graph analysis. To appear in Computer Science and Information Systems.

[3] U.S. Patent 11,720,564 & European Patent (EP4097605): Method and system for optimizing the sequence of database joins for enhanced reachability and shortest path determination.
