
Introducing Siren Platform 10.5: Knowledge Graph "Augmentation on Demand" via web services, introducing NLP and much more

Published: Thursday, May 28th, 2020

We are very proud to announce the availability of Siren Platform, version 10.5. This is a breakthrough version for a number of reasons.

On the one hand, it provides major improvements to the existing capabilities, from scalability to responsiveness, with greatly increased join performance in many common cases.

On the other hand, it introduces a number of features, which – combined with the core Siren functionality – are nothing short of state-of-the-art for investigative intelligence.

But, let’s look at these in order…

First, welcome to Siren’s new look

We are happy to present Siren Platform’s new look, with harmonized icons, buttons, and color palette. We hope you like it as much as we do! 🙂

Support for Elasticsearch 7.x and introducing Siren Platinum Edition

Siren Platform version 10.5 is now compatible with Elasticsearch versions 7.x and 6.x. Version 10.5 of Siren Investigate is distributed with Elasticsearch 7.3, but this can be updated by using a later version of the Siren Federate plugin, up to 7.6 and beyond as we release them.

We are also excited to announce that Siren is now teaming with Elastic.co to serve customers with high-end needs in the U.S. Public Sector space: Siren Platinum Edition.

Siren Platinum Edition combines Siren Business and IT Edition capabilities with those in Elastic Platinum Subscriptions, with support provided by Siren and backed directly by Elastic.co.

“Combining the power of Elasticsearch with the Siren Platform ensures that customers have an integrated and supported solution from the point of data ingestion through the investigative analysis and workflow process. We are aligned with Siren’s mission and leadership and are thrilled to build a partnership that will put the power of its investigative intelligence platform into the hands of organizations addressing some of the world’s most important challenges.” (read more)

George Young, Vice President, U.S. Public Sector at Elastic.

Contact us directly for more information about Siren Platinum Edition.

Performance improvements and even more workload safety nets

Customers with very large datasets will greatly benefit from Siren Platform, version 10.5.

Firstly, general responsiveness and front-end performance have been improved, with a reduced bundle size and more efficient dashboard rendering, shaving 3 to 6 seconds off each dashboard change in typical scenarios.

In the back-end system, the new Siren Federate plugin has a rewritten schema planner, which eliminates previous bottlenecks and provides better speeds in large multi-shard index scenarios.

In the front-end system, Siren Investigate now handles a typical problem of large Elasticsearch deployments: operators changing filters in a way that causes too many results to be processed. For example, changing filters on a busy dashboard can put a heavy load on the system, which can affect other users of the Elasticsearch cluster.

To deal with this, Siren now has a series of limits that can be optionally set per index pattern search:

  • Limit on the maximum time span.
  • Limit on the maximum number of returned results for dashboard rendering.
  • Limit on the maximum number of records before allowing a join to be executed.

These are set in the Index Pattern configuration, on the new Options tab:

Once the limits are set, the interface then acts accordingly.

For example, when a dashboard is above the set limit for joins, the relational navigation buttons will not calculate the numbers of connected documents right away, but will instead show refresh buttons that calculate the counts on demand.

The counts and the relational navigation are made available again by filtering the dashboard down (for example, by reducing the time span) or, if the user is authorized, by accepting a warning message.

An administrator has set a limit of 100k documents for a join. A power user is informed of this when clicking the relational button, but can decide to continue anyway.
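The join limit behavior described above can be sketched as a small decision function. This is only an illustration: the `SearchLimits` class, the `join_action` function, and the returned labels are hypothetical names chosen for this example, not part of Siren's configuration or API.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class SearchLimits:
    """Hypothetical container for the three optional per-search limits."""
    max_time_span: Optional[timedelta] = None
    max_results: Optional[int] = None
    max_join_records: Optional[int] = None


def join_action(doc_count: int, limits: SearchLimits, can_override: bool) -> str:
    """Decide how a relational navigation button could react to the join limit."""
    limit = limits.max_join_records
    if limit is None or doc_count <= limit:
        return "run"           # no limit set, or under it: compute counts right away
    if can_override:
        return "warn"          # authorized user: show a warning, allow continuing
    return "filter_first"      # otherwise the dashboard must be filtered down


# The scenario from the caption: a 100k document limit on joins.
limits = SearchLimits(max_join_records=100_000)
print(join_action(250_000, limits, can_override=True))
print(join_action(250_000, limits, can_override=False))
print(join_action(40_000, limits, can_override=False))
```

Keeping every limit optional mirrors the text: a limit that is not set never restricts the search.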

Knowledge graph “Augmentation on Demand” via web service support

In Siren, a data model is used to virtually connect organizational data – from databases to Elasticsearch clusters – as a single knowledge graph. Siren 10.5 introduces drivers that connect external web services to this knowledge graph so that it can grow as investigators ask questions.

At a certain point in an investigation, one might target an IP address, a company, or a user nickname. In Siren 10.5 it is now possible to invoke external web services like VirusTotal, Shodan, and Webhose to fetch relevant information, which can be used to create new connections with other data and paint a fuller picture.

One can discover, for example, that other IP addresses are connected to the one in question and that these have appeared in logs before.

When accessing web services, Siren stores both the “questions” and the “results” in Elasticsearch in a way that fits the regular Siren data model, so that any new information connects to your existing data automatically.
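As a rough sketch of the idea, a web service call can be turned into one “question” document plus one “result” document per hit, linked by a shared identifier so that a data model can join them. The field names (`invocation_id`, `service`, `query`, `invoked_at`) and the `build_documents` helper are assumptions made for this example; the actual document layout Siren uses is not described here.

```python
import uuid
from datetime import datetime, timezone


def build_documents(service, query, results):
    """Return one 'question' document and one 'result' document per hit."""
    invocation_id = str(uuid.uuid4())
    question = {
        "id": invocation_id,
        "service": service,          # which web service was asked
        "query": query,              # what was asked
        "invoked_at": datetime.now(timezone.utc).isoformat(),
    }
    # Each result carries the id of the question that produced it,
    # which is what lets a relation link the two record types.
    result_docs = [{**hit, "invocation_id": invocation_id} for hit in results]
    return question, result_docs


question, docs = build_documents(
    "ip-lookup",                     # hypothetical service name
    {"ip": "203.0.113.7"},
    [{"related_ip": "198.51.100.4"}, {"related_ip": "198.51.100.9"}],
)
print(question["service"], len(docs))
```

Because the results are ordinary documents with a key back to the question, fields such as `related_ip` could in turn be related to IP fields in existing log indices.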

Siren 10.5 comes with a set of examples for commonly-used web services, such as Webhose, JsonWhois, and Twitter, as well as documentation to create your own web service driver for other APIs. Web services can then be used as part of graph scripts, dashboard scripts, alerting scripts, and with new visual components.

In the following screenshot, a query “covid Kuala Lampur” is sent (1) to Webhose, an external news provider. The results are immediately visualized in a table (2), but what’s even more interesting is that the records are now stored in Elasticsearch. Using the data model with NLP (provided either by the service or by the new Siren NLP plugin), we have links to related records in other dashboards (3) while visualizing it all together in a graph (4).

Create custom behaviors with the Siren Scripting API

We’re extremely excited to introduce the Siren Scripting API capability: generic scripting that can be used throughout the platform to automate workflows and create ad-hoc visualizations and behaviors.

Scripts are written by administrators in JavaScript and executed in a safety sandbox on dashboards. Here are just a few examples of behaviors that can be scripted (the possibilities are endless):

  • On selecting a record: automate the graph browser to show all first and second level connections
  • On selecting a record: tailored formatting of results (e.g. summary view), executing ad-hoc queries and invoking web services
  • Create custom panels for web services: take inputs from the filters and time range currently set in the dashboard and process the results in a customized way.
  • Integrate other systems: via embedded IFrames which can get commands from actions on the dashboards.
  • Create advanced custom search boxes: calling out the fields you want people to search on
  • Create visualizations: that pop up on the side of the graph when certain kinds of nodes are selected on the graph (via a new graph panel feature)

Here are some screenshots and animations from the above scenarios.

Custom UI and interaction for a COVID-19 tracing demonstrator (fake data used). One enters a phone number, and a custom web service uses Elasticsearch to provide a list of possible contacts and contact durations; these are then turned into filters with one click.

Siren NLP: A quick and easy way to get started with Natural Language Processing

We’re very excited to introduce Siren NLP: a fully integrated free “NLP workhorse” engine, readily available in any Siren installation.

If you have textual data and want to go beyond simple “search” and tag clouds, you will want to use an NLP engine as part of your pipeline.

Siren, by its nature, works with any NLP engine, and there are many out there, both free and commercial.

Which one is best? As a rule of thumb, it depends on your use cases, and we recommend that you do your research to appreciate the breadth, depth, and specializations of each engine that applies to them.

Starting from Siren 10.5, however, we’re happy to also provide a handy, out-of-the-box NLP “workhorse”: Siren NLP.

Powered by Apache OpenNLP and enhanced for investigative use cases, Siren NLP provides solid baseline capabilities to extract mentions of companies, people, locations, hashtags, phone numbers, addresses, IBANs, IPs, and MD5s, as well as to tag terms and concepts that you provide in custom taxonomies and lists.
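To give a feel for the pattern-like entity types in that list (IPs, MD5s, hashtags), here is a minimal regular-expression sketch. It is not how Siren NLP or Apache OpenNLP work internally; the `PATTERNS` table and `extract_mentions` function are hypothetical, and entities such as people, companies, and locations need a trained model rather than patterns.

```python
import re

# Deliberately simple patterns, for illustration only.
PATTERNS = {
    # Four dot-separated groups of 1-3 digits (octet ranges are not validated).
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    # Exactly 32 hexadecimal characters.
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    # A '#' followed by word characters.
    "hashtag": re.compile(r"#\w+"),
}


def extract_mentions(text):
    """Return every match for each pattern, keyed by entity label."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}


sample = "Host 192.0.2.10 served d41d8cd98f00b204e9800998ecf8427e #malware"
mentions = extract_mentions(sample)
print(mentions)
```

Extracted values like these would typically be stored as additional fields on the document, which is what makes them usable as tags in dashboards.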

The following dashboard shows a demo article dataset with widgets that show the tags/entities extracted by Siren NLP.

Here is the idea: start your journey into NLP value extraction with the quick, easy, and free Siren NLP, and then graduate to your preferred NLP engine as your use case requirements demand.

Siren NLP comes installed as an Elasticsearch processor plugin and is also included in our new Siren Easy Start community edition (or download it separately from our portal here).

JDBC/ODBC/Python SQL drivers for Siren

In collaboration with CData, Siren now makes available SQL drivers for Siren: JDBC, ODBC, Python, and more.

By writing SQL queries, you can use the driver for custom data exports in scripts and integrations. To get data from a saved search or a dashboard, one inserts a piece of query DSL, which can be obtained via API, into the SQL query. This ensures that all the Elasticsearch and Siren operators can be used to create views, which can then be queried in standard SQL.

Updates to maps and components for positioning use cases

The Enhanced Coordinate Map visualization now allows you to load map references that are stored in Elasticsearch indexes into predefined spatial groups. You can add multiple layers of shapes and points of interest (POI), set properties per layer individually, then arrange and activate them dynamically at dashboard level. Learn more.

Also, a number of components and enhancements have been added to Siren to support advanced positioning use cases:

The graph browser can now be used as a ‘tracker map’ to track the movements of entities, both historically and live. 

Example scripts to track ‘contacts’ among individuals and other proximity use cases are available and can be fired from dashboards thanks to the new scripting and web service capabilities.

More graph capabilities

A new Cards tab is available in the Graph Browser panel.

Graph cards are info boxes that listen to the current selection and can be used for many purposes. An analyst can activate any card available in the menu, e.g. to show specific field values of the selected nodes and then quickly select a subset.

Currently the bar histogram card is available pre-installed, but given that cards are scripts written in the new Siren API scripting interface, it’s easy to create new ones that react and display information as required.

Dynamic update on node connection numbers

By default, Siren shows the number of outgoing connections on each node. In Siren 10.5 these numbers have been made responsive to changes in “expansion policies”.

Numbers on the graph now change instantly as the user changes which relations are active in the sidebar. Numbers can also be refreshed.

More native, high performance graph algorithms: common communicator

A new, efficient common communicator graph algorithm is now available. Following the release of the ‘shortest path’ capability, it is now also possible to find nodes that act as ‘communicators’ among 3 or more other nodes.

A typical use case is this: suppose you have 3 or more people, places, locations or in general entities of any kind. Is there some other entity that is tying them together?

Microsoft identified as “common communicator” – between other companies, via article mentions
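The core idea can be sketched in a few lines: an entity that is directly connected to every one of the selected entities is a candidate common communicator. This sketch is an assumption about the simplest form of the problem, using only direct neighbors on an undirected graph; it is not Siren's native implementation, and the `common_communicators` function and the sample entity names are made up for the example.

```python
from collections import defaultdict


def common_communicators(edges, targets):
    """Return nodes directly connected to every node in `targets`."""
    neighbors = defaultdict(set)
    for a, b in edges:              # treat each edge as undirected
        neighbors[a].add(b)
        neighbors[b].add(a)

    targets = set(targets)
    if not targets:
        return set()
    # A communicator must appear in the neighbor set of every target.
    shared = set.intersection(*(neighbors[t] for t in targets))
    return shared - targets         # the targets themselves do not count


edges = [
    ("Acme", "Hub"), ("Globex", "Hub"), ("Initech", "Hub"),
    ("Acme", "Globex"), ("Initech", "Other"),
]
print(common_communicators(edges, ["Acme", "Globex", "Initech"]))
```

Intersecting neighbor sets avoids enumerating paths between every pair of targets, which is one reason this kind of query can stay fast on large graphs.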

Better alerting and Internationalization

Siren Alert has received very notable improvements. Alerts can now be versioned, with older versions being replaced by newer ones while in production. Editing is greatly improved, and so is the flexibility of creating your own UI and logic for alerts.

Access control is now enabled for alerts, which are readable only by users who are authorized to view them.

What’s next in Siren 11? Investigative Dataspaces, Jira integration, extended NLP support and document revisions

We’re excited to announce that Siren 11 will be bringing a concept of “Investigative Dataspaces”.

Create dataspaces for your investigation, with the ability to clone from curated “template” environments and existing data models. Customize dashboards, visualizations, and the data model per investigation, and upload and connect your investigation-specific CSV file imports.

Jira integration (and great recipes to integrate Jira workflows with Siren) is also coming as part of the next release, as well as improved NLP support in visualizations.

Finally, we’re thrilled to say that we’ll be introducing the ability to revise data without changing the original records, through an “Annotation Index” that acts transparently (and at full scale).

Available now (also in the new EZ start download bundle)

Siren 10.5 is now available for download, and we’re happy to share that we now have a much improved EZ Start download bundle, with a matching, improved Getting Started tutorial (now also covering the NLP part).

As always, we look forward to hearing from you on our community, or simply drop us an email.
