AI-900 – Part IV

Azure AI Document Intelligence

Previously known as Form Recognizer, Azure AI Document Intelligence is a suite of AI-powered services that automate the processing of business documents. The new name reflects capabilities that go beyond forms to understanding and analyzing documents more broadly.

Document Intelligence encompasses three interconnected services that work together to streamline document processing:

  1. Document Analysis: This core service reads, understands, and analyzes various types of business documents with remarkable accuracy. It extracts valuable insights and data from documents, transforming them into structured information.
  2. Prebuilt Models: These machine learning models are specifically trained to handle common document types, such as receipts (including hotel and gas receipts), invoices, and ID cards. They identify text and field names, and the extracted data is available in JSON format, making it easy to integrate into databases and other systems.
  3. Custom Models: For organizations with unique document types, Custom Models offer the flexibility to train machine learning models with as few as five sample forms. This service requires an Azure storage account and allows businesses to tailor the models to their specific needs, ensuring accurate and efficient processing of forms that are unique to their operations.
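
To illustrate why JSON output is easy to integrate with databases and other systems, here is a minimal sketch that flattens a receipt result into a plain record. The JSON shape is a simplified, hypothetical stand-in for a prebuilt receipt model's output: the field names (`MerchantName`, `TransactionDate`, `Total`), the `value` and `confidence` keys, and the `flatten_fields` helper are assumptions for illustration, and the real service response carries more detail.

```python
import json

# Hypothetical, simplified result; NOT the service's exact schema.
SAMPLE_RESULT = """
{
  "docType": "receipt",
  "fields": {
    "MerchantName":    {"value": "Contoso Fuel", "confidence": 0.98},
    "TransactionDate": {"value": "2024-03-14",   "confidence": 0.95},
    "Total":           {"value": 42.5,           "confidence": 0.61}
  }
}
"""


def flatten_fields(result_json: str, min_confidence: float = 0.8) -> dict:
    """Return {field_name: value}, keeping fields at or above min_confidence."""
    result = json.loads(result_json)
    return {
        name: field["value"]
        for name, field in result["fields"].items()
        if field["confidence"] >= min_confidence
    }


# Low-confidence fields (here, Total) are left out so they can be reviewed by hand.
record = flatten_fields(SAMPLE_RESULT)
print(record)
```

Filtering on a confidence threshold is one common design choice: values the model is unsure about are held back for manual review instead of being written straight into a database.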

More Than Just Text Extraction

Document Intelligence does much more than simply detect and extract text from documents. It uses advanced AI models that “understand” the content of forms, recognizing specific types of data like addresses, phone numbers, dates, times, and quantities. Additionally, it grasps the relationships between field labels and their corresponding values.

Here’s how Document Intelligence helps with document processing:

  1. Custom Models: This feature allows you to train custom models using your own scanned forms. If you have specific types of documents or unique formats, you can create models that are tailored to your needs.
  2. Pre-Trained Models: For common documents like receipts, business cards, and invoices, you can use pre-trained models. These models are already trained to handle these types of documents, making it easy to extract information quickly.

Document Intelligence builds on Optical Character Recognition (OCR) as its core technology. OCR reads the text, and the service's models use that text and its layout to extract the structure, relationships, key-value pairs, and other important details from your documents. This means you get more than just the text: you get a deeper understanding of the document's content.
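
As a rough illustration of the difference between plain text and key-value pairs, the sketch below contrasts a flat list of OCR lines with labeled pairs. The `KeyValuePair` class, its `key`, `value`, and `confidence` attributes, and the `to_lookup` helper are hypothetical names chosen for this example; they are only loosely modeled on the idea of key-value output and are not the service's schema.

```python
from dataclasses import dataclass


@dataclass
class KeyValuePair:
    """Hypothetical container for one label and its matching value."""
    key: str
    value: str
    confidence: float


# Plain OCR output: text only, with no indication of which line labels which.
ocr_lines = ["Invoice No:", "INV-1001", "Due Date:", "2024-04-30"]

# Structured output: each label is explicitly tied to its value.
pairs = [
    KeyValuePair("Invoice No:", "INV-1001", 0.97),
    KeyValuePair("Due Date:", "2024-04-30", 0.93),
]


def to_lookup(kv_pairs):
    """Normalize labels (drop the trailing colon, lowercase) and map them to values."""
    return {p.key.rstrip(":").strip().lower(): p.value for p in kv_pairs}


lookup = to_lookup(pairs)
print(lookup["due date"])
```

With the pairs in hand, an application can ask for "due date" directly instead of guessing which OCR line follows which label.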

Document Intelligence Studio

Getting started with Document Intelligence is straightforward, especially with its no-code approach, which allows you to explore its functionality using sample documents as well as your own.

Here’s a simple guide to help you get started:

  1. Create a Resource: In your Azure subscription, create one of the following:
    • Document Intelligence Resource: A single-service resource dedicated to Document Intelligence.
    • AI Services Resource: A multi-service resource that gives access to Document Intelligence alongside other Azure AI services.
  2. Enable the Resource in Document Intelligence Studio: Once your resources are created, enable them in Document Intelligence Studio to begin using the service.
  3. Explore the Getting Started Page: On the Getting Started page, you can select a model to try out. This will give you an initial understanding of how the models work and allow you to test their capabilities.

Exercise: Extract form data in Document Intelligence Studio

Azure AI Search

What is knowledge mining?

Knowledge mining is an approach to discovering and extracting valuable insights from vast amounts of data within organizations. Unlike general web search platforms like Google or Bing, which index and search content across the internet, knowledge mining focuses on retrieving and making sense of data that resides internally within a company.

Challenges of Finding Data Within Organizations

Organizations often face several challenges when trying to locate and utilize their internal data:

  • Data Location: Data is frequently dispersed across various storage systems and formats, including documents, PDFs, spreadsheets, and handwritten notes. This dispersion makes it difficult to know where all the relevant data is stored.
  • Departmental Silos: Different departments may hold data in separate systems or formats, leading to fragmented information that can be hard to access collectively.
  • Time and Effort: Scanning and searching through documents manually is time-consuming and labor-intensive. The process of finding specific data often requires significant effort and can delay decision-making.
  • Data Retrieval: Extracting useful information from unstructured data sources (e.g., handwritten notes) adds another layer of complexity, often resulting in inefficient search and retrieval processes.

The Role of Knowledge Mining

Knowledge mining addresses these issues by leveraging advanced technologies to automate and streamline the process of finding and analyzing data. Here’s how it helps:

  • Scalability: Knowledge mining can analyze large volumes of data quickly and efficiently, extracting valuable insights at scale.
  • Insight Discovery: It goes beyond simple keyword searches to uncover patterns, relationships, and actionable insights hidden within the data.
  • Unified Access: By integrating with various data sources and formats, knowledge mining platforms make it easier to access and analyze data from across the organization.

Azure AI Search: Azure’s Knowledge Mining Platform

Azure AI Search is a powerful knowledge mining solution offered by Microsoft Azure. It enables organizations to unlock insights from their data through various features:

  • Search Capabilities: Utilize AI-powered search to quickly locate relevant information. For example, deploy bots that can answer specific questions based on the data.
  • Dashboards: Create visual representations of data to facilitate easier analysis and understanding of key insights.
  • Business Applications: Integrate search and analysis capabilities directly into business applications, streamlining workflows and improving efficiency.
  • Further Analysis: The insights obtained through knowledge mining can be used for more in-depth analysis, driving informed decision-making and strategic planning.

Ingest

  • Azure Blob Storage containers
  • Azure SQL Database
  • Documents in Cosmos DB
  • Azure Data Lake Storage Gen2
  • Azure Table Storage

AI enrichment & index

  • AI enables deeper understanding: extract information and patterns
  • Azure AI services: Vision, Natural Language Processing, etc.
  • Indexing makes content searchable

Explore

  • Search is performed on indexes
  • Results are used within applications and to create data visualizations

Azure AI Search works with data in JSON format: each item in a search index is a JSON document, and content ingested from other sources is converted to JSON during indexing. JSON is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. It organizes structured, hierarchical data efficiently, which is crucial for effective search and analysis.
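
The relationship between JSON documents and a searchable index can be sketched with a toy inverted index. This illustrates the general idea only and is not how Azure AI Search is implemented; the document fields (`id`, `title`, `content`) and the helper names (`tokenize`, `build_index`, `search`) are hypothetical.

```python
import json
import re
from collections import defaultdict

# Hypothetical JSON documents, as they might look after ingestion.
documents_json = """
[
  {"id": "1", "title": "Travel policy", "content": "Employees book travel through the portal."},
  {"id": "2", "title": "Expense report", "content": "Submit receipts for travel expenses."},
  {"id": "3", "title": "Onboarding", "content": "New employees receive a laptop."}
]
"""


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc in docs:
        for field in ("title", "content"):
            for term in tokenize(doc[field]):
                index[term].add(doc["id"])
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


docs = json.loads(documents_json)
index = build_index(docs)
print(search(index, "travel"))
print(search(index, "employees travel"))
```

Queries run against the index, not the original files, which is why indexing has to happen before content becomes searchable.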

Explore an Azure AI Search index (UI) – Demo
