ABBYY Artificial Intelligence
Purpose-Built AI Center
At the heart of ABBYY solutions we employ a combination of technologies to deliver best-in-class intelligent document processing (IDP).
Innovative AI is built into ABBYY’s IDP platform in all steps within the intelligent document processing pipeline, from image enhancement to object detection, OCR/ICR, classification, extraction from semi-structured documents, and extraction from unstructured documents.
Using the right combination of technologies and techniques, ABBYY IDP solutions can process any kind of document: any format, any language, any structure. All our specialized techniques have been optimized to produce the best possible inferences with the fewest resources, keeping costs low and delivering the greatest ROI for our customers.
Cutting-edge AI tools powering ABBYY’s purpose-built solutions
A combination of AI models and algorithms, each highly optimized for its task.
Phoenix 1.0
Phoenix 1.0 is a cutting-edge multimodal model that combines advanced image and text analysis by integrating Convolutional Neural Networks (CNNs) for visual data processing with the RoBERTa language model for text comprehension. Phoenix features an innovative AI-driven pipeline that offers zero-shot key/value pair extraction capabilities, automating the most cumbersome tasks of document model training. Unlike broad language models used on their own, which address a wide range of language understanding tasks, Phoenix provides a more robust framework for document processing, particularly when dealing with multimodal data. It offers enhanced capabilities in feature extraction, efficiency in processing workflows, and a deeper understanding of context that broad language models alone may not fully achieve. This specialization makes it an ideal choice for use cases that rely heavily on information transmitted through documents, ensuring data is processed with precision and swift turnaround times.
Phoenix was developed with a targeted focus on enhancing the efficiency and effectiveness of document processing tasks. Pairing the strengths of convolutional neural networks for image analysis with the advanced language comprehension of RoBERTa allows for a nuanced understanding of complex documents that contain both textual and visual elements. This focused approach means that businesses can achieve superior accuracy in extracting and analyzing information compared to using general-purpose models. Furthermore, the design minimizes resource consumption by streamlining the processing workflow, ultimately improving speed and reducing operational costs. As a result, organizations can process documents more effectively and improve overall productivity.
Machine learning
Our intelligent document processing leverages a mix of technologies to deliver unparalleled performance. A combination of deep machine learning and fast machine learning maximizes the straight-through processing (STP) rate. With our document-specific AI models, pre-trained using deep machine learning, our customers can achieve as much as 90 percent accuracy right out of the box. But with the inclusion of fast machine learning, that accuracy climbs above 95 percent. Fast machine learning will memorize the outliers that deep machine learning couldn’t get, and it works quickly, with just a few variations of the documents in question. And with the data we collect from that process, our deep learning continually improves to deliver higher and higher accuracy over time.
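To see why a few points of accuracy matter for straight-through processing, consider a rough back-of-the-envelope sketch. It assumes the quoted accuracy is measured per field and that field errors are independent; the text above states neither, and the ten-field document and the `straight_through_rate` helper are hypothetical, chosen only for illustration.

```python
def straight_through_rate(field_accuracy, fields_per_document):
    """Probability that every field on a document is extracted correctly.

    Assumes independent per-field errors, which is a simplification made
    for this sketch and not a claim about any real product.
    """
    return field_accuracy ** fields_per_document


for accuracy in (0.90, 0.95):
    rate = straight_through_rate(accuracy, fields_per_document=10)
    print(f"field accuracy {accuracy:.0%} -> untouched documents {rate:.1%}")
```

Under these assumptions, moving from 90 to 95 percent field accuracy raises the share of ten-field documents that need no manual correction from roughly 35 percent to roughly 60 percent.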
Deep learning allows us to pre-train AI models specifically for document processing tasks. Unlike general-purpose LLMs or Gen AI, which are designed for a broad range of tasks, our deep learning models excel in their specialized purpose, providing more reliable and accurate results.
- Deep machine learning (ML) uses CNNs (convolutional neural networks), RNNs (recurrent neural networks), and NLP (natural language processing) to extract information from semi-structured documents. It generalizes across various document formats, effectively handling unseen layouts without relying on templates. Although it requires a substantial amount of labeled data (between 500 and 10,000 documents) for accurate field extraction, the extended training process ensures high precision, making it a powerful tool for complex data interpretation.
- Fast machine learning (ML) focuses on textual and visual patterns, working efficiently with as few as one or two documents per set. It uses clustering technology that groups similar-looking document layouts together and internally trains a field extraction model for each cluster. Unlike deep ML, this approach focuses on document variations it has already “seen” rather than generalizing the patterns. Its clear advantage is that it accelerates the learning process, requires less CPU power and yields shorter processing times.
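The division of labor between the two approaches can be illustrated with a toy sketch. It is not ABBYY's implementation: the `FastLayoutMemory` class, the `layout_signature` fingerprint, and the `generic_extract` fallback are hypothetical names invented here, and "clustering" is reduced to exact matching on a coarse grid of label positions. The generic extractor is only a crude stand-in for a pre-trained model.

```python
GRID = 50  # cell size in pixels; an arbitrary choice for this sketch


def cell(x, y):
    return (x // GRID, y // GRID)


def layout_signature(tokens):
    """Fingerprint a page by where its label tokens (those ending in ':') sit."""
    return frozenset((text, cell(x, y)) for text, x, y in tokens if text.endswith(":"))


class FastLayoutMemory:
    """Remembers, per layout signature, the grid cell where each field value was seen."""

    def __init__(self):
        self.clusters = {}  # signature -> {field name: cell}

    def learn(self, tokens, labeled_fields):
        """Store value positions from one document labeled by a reviewer."""
        positions = {}
        for field, value in labeled_fields.items():
            for text, x, y in tokens:
                if text == value:
                    positions[field] = cell(x, y)
                    break
        self.clusters[layout_signature(tokens)] = positions

    def extract(self, tokens):
        """Return fields for a known layout, or None if the layout is unseen."""
        positions = self.clusters.get(layout_signature(tokens))
        if positions is None:
            return None
        by_cell = {cell(x, y): text for text, x, y in tokens}
        return {field: by_cell.get(c) for field, c in positions.items()}


def generic_extract(tokens):
    """Stand-in for a pre-trained model: take the token right after each label on its row."""
    ordered = sorted(tokens, key=lambda t: (t[2], t[1]))  # by row, then column
    fields = {}
    for (text, _, y), (nxt, _, ny) in zip(ordered, ordered[1:]):
        if text.endswith(":") and ny == y:
            fields[text.rstrip(":").lower()] = nxt
    return fields


def extract(tokens, memory):
    """Prefer the memorized layout; fall back to the generic extractor."""
    return memory.extract(tokens) or generic_extract(tokens)


# A layout whose values sit below their labels, which the generic rule misreads.
doc_a = [("Total:", 10, 10), ("Date:", 210, 10), ("42.00", 10, 60), ("2024-01-05", 210, 60)]
doc_b = [("Total:", 12, 14), ("Date:", 215, 12), ("17.50", 15, 62), ("2024-02-11", 212, 65)]

memory = FastLayoutMemory()
memory.learn(doc_a, {"total": "42.00", "date": "2024-01-05"})  # one labeled example
print(extract(doc_b, memory))
```

A single corrected document is enough for the memory to handle the next document with the same layout, while layouts it has never seen still go through the generic path. That mirrors the trade-off described above: the memorizing component is fast and cheap but does not generalize.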
OCR & ICR – optical character recognition and handwriting recognition
ABBYY is a pioneer in optical character recognition technology, actively researching and innovating in this area since 1993, when our first “omnifont OCR system” ABBYY FineReader was launched to the market. Over the years, the technology has evolved from recognizing individual characters, identifying words, and reproducing page structure to applying adaptive document recognition technology (ADRT®) that understands documents in their entirety, including layout, multi-page structure, and elements such as headers, footers, and tables of contents.
With the advancements of AI, ABBYY has developed and solidified its end-to-end approach to OCR and ICR in the last several years. This approach uses the same technologies that are the basis of generative AI tools—convolutional neural networks, transformers, and language models.
The convolutional neural network breaks an image of handwritten or printed text into its component parts, trying to make sense of what it actually is. The output of the CNN then goes into a transformer, which proposes candidate words. Then, we introduce our very own language model (LM), which has billions of parameters and a specific function: to take the context of all of the different words in a group and use it to reach the best conclusion. This technique drastically improves the performance and accuracy of our OCR and ICR capabilities overall, and it is leveraged in combination with our statistical approach. Our AI automatically decides which approach is the best fit for your document use cases, optimizing on the fly for consistency, accuracy, and speed, leading to better straight-through processing rates.
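The idea of letting word context settle an ambiguous reading can be sketched in a few lines. This is a generic illustration, not ABBYY's pipeline: the candidate lists stand in for a recognizer's output, the tiny `BIGRAMS` table stands in for a language model, and all names and probabilities are made up for the example. The search itself is a standard Viterbi pass over word candidates.

```python
import math

# Hypothetical bigram probabilities standing in for a real language model.
BIGRAMS = {(None, "invoice"): 0.3, ("invoice", "total"): 0.5, ("total", "due"): 0.6}


def toy_bigram_logprob(prev, word, floor=1e-4):
    """Log-probability of `word` following `prev`; unseen pairs get a small floor."""
    return math.log(BIGRAMS.get((prev, word), floor))


def best_sequence(candidates, bigram_logprob, lm_weight=1.0):
    """Viterbi search over per-position candidate lists.

    candidates: one list per word position, each holding (word, recognizer_probability).
    Returns the word sequence with the highest combined recognizer and LM score.
    """
    # best[word] = (score, path) for the best path so far that ends in `word`
    best = {None: (0.0, [])}
    for options in candidates:
        nxt = {}
        for word, prob in options:
            for prev, (score, path) in best.items():
                s = score + math.log(prob) + lm_weight * bigram_logprob(prev, word)
                if word not in nxt or s > nxt[word][0]:
                    nxt[word] = (s, path + [word])
        best = nxt
    return max(best.values(), key=lambda sp: sp[0])[1]


# The recognizer slightly prefers the misreading "tota1" when looking at the word alone.
candidates = [
    [("invoice", 0.9), ("invoke", 0.1)],
    [("tota1", 0.55), ("total", 0.45)],
    [("due", 0.6), ("doe", 0.4)],
]

greedy = [max(options, key=lambda wp: wp[1])[0] for options in candidates]
print(greedy)
print(best_sequence(candidates, toy_bigram_logprob))
```

Picking the top candidate at each position independently keeps the misreading, whereas scoring whole sequences lets the surrounding words pull the result toward "total".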
Computer vision
ABBYY leverages advanced computer vision technology as a key component of its intelligent document processing solutions to enhance automation and data extraction from complex documents. By integrating neural networks, including convolutional neural networks (CNNs) and transformers, ABBYY processes visual content such as text, images, and even handwritten documents. The CNNs break down visual elements in documents, identifying patterns in printed or handwritten text, while transformers analyze the context to improve accuracy in word and character recognition. This technology enables ABBYY to accurately interpret and classify a wide range of document types, from structured forms to unstructured, text-heavy content.
Furthermore, ABBYY's solutions incorporate object detection techniques to identify features like barcodes, signatures, and stamps, which are essential for applications in industries like insurance and logistics. By combining computer vision with language models and other AI technologies, ABBYY enhances document processing capabilities, allowing businesses to automate workflows more effectively, reduce manual errors, and improve straight-through processing rates.
Natural language processing
ABBYY's implementation of natural language processing (NLP) within its intelligent document processing solutions offers transformative advantages for enterprises seeking to optimize their document management processes. By utilizing advanced NLP techniques such as named entity recognition (NER), deep machine learning (DeepML), and summarization, ABBYY’s Vantage platform excels in extracting structured data from both structured and unstructured documents efficiently. Through the integration of deep learning capabilities, the platform provides a customizable NLP system that aligns with unique business requirements. Both developers and business users can train these systems to recognize customized named entities, ensuring a tailored solution while maintaining transparency and control over the models used. This capacity facilitates quicker and more precise business operations, exemplified by accelerated loan processing and streamlined contract management.
The application of ABBYY's NLP capabilities brings substantial benefits to enterprise settings. These include enhanced operational efficiency through the automation of routine document tasks, notable improvements in data extraction accuracy and reliability, and an increase in processing speeds that supports faster decision-making processes. Furthermore, ABBYY's solutions are instrumental in compliance and data privacy management by precisely identifying sensitive information in adherence to regulatory standards. Industries such as banking, finance, legal, and healthcare are particularly well-suited to leverage these advanced techniques including segmentation, query generation, and summarization. These tools enable organizations to convert raw data into actionable insights, thereby enhancing service delivery to customers and boosting operational efficiency across the board.
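As a minimal illustration of flagging sensitive information in text, the sketch below uses plain regular expressions. This is a deliberately simpler technique than the trained named entity recognition described above, and it is not how ABBYY's products work; the `PATTERNS` table, the labels, and the function names are hypothetical, and the two patterns cover only email addresses and US-style Social Security numbers.

```python
import re

# Hypothetical pattern table; a real system would use trained NER models.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_sensitive(text):
    """Return (label, start, end, matched text) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((label, m.start(), m.end(), m.group()))
    return sorted(hits, key=lambda h: h[1])


def redact(text):
    """Replace each hit with its bracketed label, right to left so offsets stay valid."""
    for label, start, end, _ in reversed(find_sensitive(text)):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


sample = "Contact jane.doe@example.com, SSN 123-45-6789."
print(redact(sample))
```

Pattern matching of this kind only catches values with a rigid surface form. Entities such as names or addresses depend on context, which is where trained NER models are needed.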
NeoML
NeoML is ABBYY's comprehensive, open-source machine learning framework designed to cater to both deep learning and traditional machine learning tasks. This versatile tool supports over 100 types of neural network layers and more than 20 traditional machine learning algorithms, making it adaptable for a wide range of applications such as computer vision and natural language processing. NeoML's compatibility with cross-platform environments, including Windows, Linux, macOS, iOS, and Android, ensures seamless integration within existing enterprise infrastructures. Additionally, NeoML supports the Open Neural Network Exchange (ONNX) format, which allows for interoperability with other machine learning tools, further enhancing its utility in diverse programming environments through languages such as C++, Java, and Objective-C.
For enterprises, NeoML offers a robust, scalable, and cost-effective solution to deploying machine learning models across various business functions. Its open-source nature, sanctioned by an Apache 2.0 license, means organizations can tailor NeoML to their specific needs without incurring high costs, thereby maximizing resource allocation efficiency. With comprehensive community support, NeoML benefits from continuous improvements and updates, ensuring enterprises have access to cutting-edge machine learning capabilities. The framework's high performance, enabled by both CPU and GPU support, guarantees rapid data processing and timely results, making it an ideal choice for businesses looking to leverage machine learning to drive innovation and optimize operational efficiencies.
Carlsberg
Carlsberg speeds time to market for beverages
deliveries and customer satisfaction
hours saved per month
touchless order processing
U.S. FDA uses IDP to protect public health
capture accuracy of critical details
complex fields on two dozen forms
year archive of forms transformed
Improving the customer journey with process intelligence
“ABBYY Process Intelligence has helped us make a real cultural change: to rely on data and make data-driven improvements.”
- Simon Higgs, Director of Business Transformation
Carlsberg
U.S. FDA
Emerson
The world’s leading companies trust ABBYY
10,000+ customers
400+ patents
30+ years’ experience
Our tenets of trustworthy AI
Commitment to responsible data science
We are committed to rigorous product development based on responsible data science principles of confidentiality, accuracy, and security embedded into our AI-driven intelligent document processing and process intelligence product portfolio.
Safeguarding the confidentiality of personal information
We incorporate privacy-by-design methodologies that empower our customers to control and limit the collection and processing of personal information.
Ensuring product accuracy and quality
We develop our AI-driven product portfolio to meet industry standards for information accuracy.
Conforming to industry standards of security
We deploy zero-trust security principles designed to minimize cybersecurity risks.
ABBYY’s approach to ethical AI
Purpose-built AI that delivers business value and utility
- We are committed to providing transparency relating to the capabilities of our AI products and facilitating customer feedback.
- We are committed to using AI risk management frameworks, which provide a structured approach to assessing risks across the life cycle of our AI-enabled product portfolio.
- We are committed to complying with applicable regulations relating to data privacy and AI.