Over the past few decades, programming languages have evolved from labor-intensive, error-prone assembly languages to high-level languages with support for concurrency and distributed computing. Despite these advances, writing code remains a highly specialized skill that requires years of practical experience, and AI / ML / NLP technologies add even more complexity: they require wrangling multiple big data sources, building models, managing pipelines, and operating at scale.

This post describes why and how Accern went about solving this problem with no-code NLP.

Why We Coded for No-Code NLP 

AI / ML / NLP technologies have the potential to solve problems across every industry, from managing asset portfolios and analyzing credit risk for commercial banks to extracting customer insights in insurance and supporting ESG-focused investing. It is more important than ever to help people solve problems quickly, at scale, without requiring them to be programmers or data scientists.

According to an IDC survey, global spending on artificial intelligence (AI) is forecast to more than double over the next four years, growing from $50.1 billion in 2020 to more than $110 billion in 2024, and NLP budgets are growing 10-30% according to another industry survey.

Our mission at Accern is to lower the barrier to entry and empower more people to solve real-world problems with AI / ML / NLP. After spending our first few years working closely with business users at various financial firms, we decided to build reusable components that would be generic enough to solve a large array of problems, yet flexible enough to be customized for specific requirements.

The Accern NoCodeNLP Platform provides a solid foundation for business users and data scientists alike to immediately build and deploy NLP use cases. While general-purpose Data Science and Machine Learning (DSML) platforms require months (and sometimes years) of effort and substantial capital expenditure, Accern enables users to harness the power of NLP at a fraction of the cost, with massive gains in performance, accuracy, and productivity.

Common Issues with No-Code NLP

To build a no-code solution, we identified the most common issues that NLP use cases encounter:

  • Functional issues:
    • Business use case not supported
    • Lack of customization in the models
    • Accuracy of the models
  • Technical issues:
    • Difficulty integrating with existing tools
    • Scalability issues
    • Security in the cloud
    • Rising infrastructure cost

While there are several no-code products that have simplified the process of building web applications, building such a tool for NLP still requires highly specialized skills. Accern has had the privilege of solving real-world problems faced by domain experts and business users, and we have been able to capitalize on this knowledge to build our NoCodeNLP Platform; we had to code to enable no-code.

Solving the Functional Issues of No-Code NLP

After spending many person-years building AI / ML / NLP-driven workflows and processing hundreds of millions of unstructured documents, we identified the common components that substantially reduce both the cost and the time required to run NLP models on large datasets.

Use-Case Driven Approach

We adopted a use-case driven approach and analyzed over 400 use cases and related workflows to ensure that people with no data science or technical background could build and deploy NLP models.

We tested use cases related to ESG behaviors, Covid-19 impacts on credit risk, spotting mergers and acquisitions (M&A) opportunities, monitoring targets, and more. We made it easy for the citizen data scientist to research, summarize, and extract insights with just a few clicks.

Pre-Built and Custom Use Cases

Users can choose from a number of pre-built use cases or deploy workflows specific to their own requirements. Each workflow is secure and available as a self-service tool in the Accern platform, giving users the ability to deploy use cases within a few minutes rather than the months (or years) it would otherwise take to build and deploy them.

Benchmarked for Accuracy

The Accern Entity Extraction and Sentiment Analysis models have accuracy rates of 99.7% and 93.5%, respectively, and customers have access to our benchmark datasets to compare and evaluate our performance against third-party models.

Learn more about Accern sentiment analysis in this downloadable eBook, A Benchmark of Popular Sentiment Analysis Models on Financial News.
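To illustrate what such a benchmark comparison looks like in practice, here is a minimal sketch, not Accern's evaluation harness, that scores two sentiment models against the same labeled dataset. The tiny inline benchmark and the placeholder model functions are purely illustrative.

from sklearn.metrics import accuracy_score

# A handful of labeled headlines standing in for a real benchmark file.
benchmark = [
    ("Company X beats earnings expectations", "positive"),
    ("Regulator opens probe into Company Y", "negative"),
    ("Company Z reaffirms full-year guidance", "positive"),
]

def model_a(headline):
    # Placeholder: substitute a call to the sentiment model under evaluation.
    return "positive" if "beats" in headline or "reaffirms" in headline else "negative"

def model_b(headline):
    # Naive always-positive baseline used only for comparison.
    return "positive"

labels = [label for _, label in benchmark]
for name, model in [("model_a", model_a), ("model_b", model_b)]:
    preds = [model(headline) for headline, _ in benchmark]
    print(name, accuracy_score(labels, preds))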

Solving the Technical Issues of No-Code NLP

Data and Dashboard Connectors for Integrations

Accern supports a wide range of connectors for sourcing data as well as for sending the output of a pipeline to external systems. Customers can choose not only the connector they prefer but also the format of the data they want to send.

The platform provides out-of-the-box support for connections to many data warehouses and databases, including Snowflake, Amazon Redshift, Microsoft Azure, and MongoDB. The connectors are deployed on highly resilient infrastructure that spans multiple data centers for high availability.
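As an illustration of the kind of integration such a connector performs, the sketch below, which is not Accern's internal connector code, pushes a few scored NLP records into a Snowflake table using the open-source snowflake-connector-python package. The table layout, column names, and credentials are placeholders.

import snowflake.connector

# Example pipeline output: (entity, sentiment, score, event_date).
rows = [
    ("AAPL", "positive", 0.94, "2024-01-15"),
    ("TSLA", "negative", 0.81, "2024-01-15"),
]

conn = snowflake.connector.connect(
    user="PIPELINE_USER",        # placeholder credentials
    password="***",
    account="my_account",
    warehouse="ANALYTICS_WH",
    database="NLP_OUTPUT",
    schema="PUBLIC",
)
cur = conn.cursor()
cur.execute(
    "CREATE TABLE IF NOT EXISTS signals "
    "(entity STRING, sentiment STRING, score FLOAT, event_date DATE)"
)
# Bulk-insert the scored records into the destination table.
cur.executemany("INSERT INTO signals VALUES (%s, %s, %s, %s)", rows)
conn.close()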

Scalable Adaptive Infrastructure

Due to the scale of the data we had to process, it was important to build our solution as a cloud-native application. The platform needed to support both multi-tenancy and on-premise deployment to accommodate different customer needs. We solved this problem by using Kubernetes to manage containers. Auto-scaling groups allow us to provision the appropriate amount of resources to run a pipeline depending on the workload.
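For readers unfamiliar with how this looks in practice, here is a minimal, illustrative sketch of the kind of autoscaling policy Kubernetes supports: an autoscaling/v1 HorizontalPodAutoscaler manifest built as a Python dict and written out as YAML. The deployment name and thresholds are invented for the example and are not Accern's actual configuration.

import yaml

# Scale a hypothetical NLP worker Deployment between 2 and 20 replicas,
# targeting 70% average CPU utilization.
hpa = {
    "apiVersion": "autoscaling/v1",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "nlp-pipeline-workers"},
    "spec": {
        "scaleTargetRef": {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "name": "nlp-pipeline-workers",
        },
        "minReplicas": 2,
        "maxReplicas": 20,
        "targetCPUUtilizationPercentage": 70,
    },
}

# Write the manifest so it can be applied with `kubectl apply -f hpa.yaml`.
with open("hpa.yaml", "w") as f:
    yaml.safe_dump(hpa, f)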

We also made a conscious decision to manage the core technology stack ourselves, including the databases, interprocess messaging, and the data pipeline. This gives us the ability to deploy the platform on any cloud provider, thereby avoiding vendor lock-in. Using Prometheus and Grafana, system telemetry is collected in real time for monitoring and alerting on the services running in the cloud.
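As a rough illustration of how a service can expose telemetry for Prometheus to scrape and Grafana to chart, the sketch below uses the open-source prometheus_client package. The metric names, port, and simulated queue depth are assumptions made for the example, not Accern's actual instrumentation.

import random
import time

from prometheus_client import Counter, Gauge, start_http_server

# Two example metrics a pipeline worker might expose.
DOCS_PROCESSED = Counter("nlp_documents_processed",
                         "Documents processed by the pipeline")
QUEUE_DEPTH = Gauge("nlp_pipeline_queue_depth",
                    "Documents currently waiting to be processed")

# Serve metrics at http://localhost:8000/metrics for Prometheus to scrape.
start_http_server(8000)

while True:
    DOCS_PROCESSED.inc()
    QUEUE_DEPTH.set(random.randint(0, 50))  # stand-in for a real queue lookup
    time.sleep(1)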

Cost Optimization

On average, 80% of cloud infrastructure costs come from compute resources. The Accern NoCodeNLP Platform reduces the cost of deploying NLP use cases by separating compute resources into different categories (stateless, serverless, stateful, etc.).

When a use case is deployed, the Accern platform determines its expected compute resource requirements using a proprietary metric called Units. This also helps ensure that resources are acquired in a cost-effective manner. Shared components are made accessible via APIs to further reduce the cost of running complete pipelines.
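The actual Units calculation is proprietary, but the following purely hypothetical sketch conveys the general idea: translating a use case's expected workload into a resource estimate before any compute is acquired. Every weight, category, and price below is invented for illustration and is not Accern's metric.

# Hypothetical weights per resource category (illustrative only).
UNIT_WEIGHTS = {"stateless": 1.0, "serverless": 0.5, "stateful": 2.0}

def estimate_units(docs_per_day, models_in_pipeline, category):
    # Rough heuristic: work scales with document volume and pipeline depth,
    # weighted by how expensive the resource category is to keep running.
    return (docs_per_day / 10_000) * models_in_pipeline * UNIT_WEIGHTS[category]

units = estimate_units(docs_per_day=250_000, models_in_pipeline=3,
                       category="serverless")
print(f"Estimated units: {units:.1f}")                # 37.5 units
print(f"Estimated daily cost: ${units * 0.40:.2f}")   # assuming $0.40 per unit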

Isolated Resources for Security

Services are deployed in a secure VPC with automated monitoring and alerting enabled for each pipeline. The services are distributed across multiple data centers for high availability. With regular backups and best practices for data recovery, we are able to assure our clients that their data will remain available even in a disaster recovery (DR) scenario.

For each pipeline, client-specific components are deployed in their own isolated containers to guarantee both availability and security in a multi-tenant environment.

Access to production data is highly restricted via role-based access control (RBAC), thereby guaranteeing the integrity and security of the data.
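As one concrete (and simplified) illustration of role-based access control in a Kubernetes environment like the one described above, the sketch below builds a namespaced Role and RoleBinding that grant read-only access to a single pipeline's resources. The namespace, group, and resource names are illustrative; this is not Accern's actual policy.

import yaml

# Read-only Role scoped to one client's pipeline namespace.
role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"namespace": "client-a-pipeline", "name": "pipeline-reader"},
    "rules": [{
        "apiGroups": [""],
        "resources": ["pods", "pods/log", "configmaps"],
        "verbs": ["get", "list", "watch"],  # read-only; no write or delete
    }],
}

# Bind the Role to an on-call operations group.
binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"namespace": "client-a-pipeline", "name": "ops-read-only"},
    "subjects": [{"kind": "Group", "name": "ops-oncall",
                  "apiGroup": "rbac.authorization.k8s.io"}],
    "roleRef": {"kind": "Role", "name": "pipeline-reader",
                "apiGroup": "rbac.authorization.k8s.io"},
}

# Write both documents so they can be applied with `kubectl apply -f rbac.yaml`.
with open("rbac.yaml", "w") as f:
    yaml.safe_dump_all([role, binding], f)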

What’s Next for No-Code NLP

It is unlikely that no-code will completely eliminate the need for writing code; there will always be developers at companies like Accern writing the code that pushes the no-code movement further. What is certain, though, is that no-code NLP platforms will continue to gain acceptance as more people discover their power. As technologists, we will strive to make it even easier for business users to harness the power of AI / ML / NLP.