Enhancing efficiency and safety in the energy sector with OCR technology

15 min read · Mar 4, 2024

The advent of artificial intelligence (AI) has opened exciting prospects in data acquisition, generation, and transformation, offering considerable potential across industrial sectors and professional settings. More specifically, Computer Vision and Optical Character Recognition (OCR) technologies are changing how we handle information by enabling automatic reading and digitization of a wide range of documents, such as handwritten inspection reports, site plans, equipment photos, invoices, delivery receipts, and many others.

Although information is now mainly generated in digital format, many business situations still require information to be digitized. For instance, whenever textual information is physically stored on media such as printed or handwritten documents, equipment labels, or other sorts of objects, it must be digitized for efficient processing and cross-application use. Previously, this transcription task was performed manually, resulting in significant time and quality losses, either because essential data was buried amid less relevant information (e.g., when detecting a product anomaly) or because of the sheer volume of documents to digitize, as often encountered in invoice processing. Thanks to AI, this tedious operation can now be automated. Once the extracted information is available in digital form, its quality can be enhanced, for instance through cross-referencing, and it can then be used to feed any sort of business process.

Modern Machine Learning (ML) and AI algorithms can swiftly analyze and correlate digital data, extracting valuable insights and uncovering patterns that might otherwise remain hidden, further demonstrating the essential power of OCR and digital conversion as the first step to optimize the value of information.

AI-driven data extraction and reading find multiple applications, catering to the digitization needs arising from physical interventions, such as commercial transactions (sales, shipments, goods receipts, manufacturing) or quality checks (on-site inspections, compliance controls, etc.). In each intervention, the involved parties generate new data, such as reports, observations, signatures, and photographs, which subsequently require digitization.

In this article, we will explore in detail two specific use cases of AI-driven data extraction and reading in the energy sector. We will highlight the benefits they bring, particularly in enhancing operational efficiency and paving the way for innovation opportunities. So, let’s delve together into this fascinating universe and discover how these technological advances shape the professional world of today and tomorrow.

OCR technology

In today’s digital age, where information is key, the ability to efficiently convert paper documents, scanned images, and PDF files into editable and searchable data is a game-changer. OCR technology is a powerful tool that enables the extraction of textual content from various sources, changing the way we handle documents. It consists of electronically converting images of typed, handwritten, or printed text into machine-encoded text, making it easier to store, manipulate, and analyze the content digitally.

Figure 1 — OCR extracts text from images, PDFs, or scanned documents and converts it into machine-readable text.

Traditionally, OCR falls under the umbrella of Computer Vision, the field of AI that focuses on enabling computers to interpret and understand images. A digital image is essentially a 2D array of pixels, where the brightness of each pixel is represented by values between 0 and 255: one value for greyscale images, and three values (red, green, and blue brightness) for color images. Computers can therefore read images as multidimensional numeric arrays (or tensors). Starting from that representation, mathematical algorithms can help computers identify features and patterns in images.

Figure 2 — How computers read images, with an example of handwritten “8”
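To make the pixel representation concrete, here is a minimal sketch in plain Python. The pixel values are invented for illustration, and the greyscale conversion uses the common ITU-R BT.601 luma weights, which is one possible choice among several and is not stated in this article.

```python
# A tiny 5x5 greyscale "image": each value is a brightness between 0 and 255.
# The values below are made up for illustration.
grey_image = [
    [0,   0, 255,   0, 0],
    [0, 255,   0, 255, 0],
    [0,   0, 255,   0, 0],
    [0, 255,   0, 255, 0],
    [0,   0, 255,   0, 0],
]


def shape(image):
    """Return (height, width) of a 2D image stored as nested lists."""
    return len(image), len(image[0])


def to_greyscale(rgb_image):
    """Convert an RGB image (rows of (r, g, b) tuples) to greyscale.

    Assumption: BT.601 luma weights; other weightings exist.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


# A 2x2 color image: pure red, pure green, pure blue, and white.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

print(shape(grey_image))        # height and width of the greyscale image
print(to_greyscale(rgb_image))  # one brightness value per pixel
```

A color image therefore carries three numbers per pixel, and collapsing them into one is typically the first thing an OCR preprocessing stage does.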

At its core, OCR technology is a multi-step process designed to extract textual content from images. Each OCR algorithm has its own specificities, but generally the main steps are the following:

1 — The journey begins with preprocessing, where the image undergoes analysis and enhancement, ensuring optimal text extraction. Techniques such as grayscale conversion, binarization, noise reduction, rotation correction and many more can be employed to improve the quality of overall output.
2 — The next step is text detection and character recognition. The OCR system first identifies areas within the image likely to contain textual content. By analyzing patterns, shapes, and other visual cues, the system pinpoints the regions that contain characters, setting the stage for character classification, which is the heart of OCR. Here the system examines each identified text region and tries to recognize and classify individual characters accurately. Different methods are used for this: some OCR systems rely on template matching, comparing observed character shapes against a pre-existing set of templates, while others employ machine learning algorithms to learn patterns and classify characters. Together, these steps extract the textual information contained in the document.
3 — In some cases, OCR technology can go the extra mile with post-processing techniques to enhance accuracy and refine the output. Error correction algorithms rectify misinterpreted characters, while spell-checking mechanisms reduce spelling errors in the recognized text. Additionally, formatting adjustments may be made to align the extracted text with standard conventions, making it easily readable and compatible with a range of software applications.

Figure 3 — Main steps of OCR algorithms.
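The three steps above can be sketched on a toy example. This is not how a production OCR engine works: the 3x3 templates, the threshold of 128, and the small vocabulary are all assumptions made for illustration, and text detection is skipped because each glyph is already isolated.

```python
import difflib

# Hypothetical 3x3 glyph templates (1 = ink, 0 = background).
TEMPLATES = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}


def binarize(grey, threshold=128):
    """Step 1, preprocessing: dark pixels (below threshold) become ink (1)."""
    return [[1 if px < threshold else 0 for px in row] for row in grey]


def match_score(glyph, template):
    """Count the pixels on which the glyph and the template agree."""
    return sum(
        g == t
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    )


def classify(glyph):
    """Step 2, classification by template matching: keep the best template."""
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))


def postprocess(word, vocabulary, cutoff=0.6):
    """Step 3, post-processing: snap to the closest known word, if any."""
    hits = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return hits[0] if hits else word


def render(template):
    """Build a synthetic greyscale glyph from a template (ink is dark)."""
    return [[0 if bit else 255 for bit in row] for row in template]


glyphs = [render(TEMPLATES[ch]) for ch in "TILT"]
glyphs[1][1][0] = 90  # simulate a dark speck of noise next to the "I"

raw = "".join(classify(binarize(g)) for g in glyphs)
print(raw)
print(postprocess("TILL", ["TILT", "LIT"]))
```

The noisy "I" is still classified correctly because it agrees with its template on 8 of 9 pixels. Dictionary-based correction only helps when the expected vocabulary is known, which is often the case for equipment labels or form fields.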

One of the key considerations in OCR is the adaptability of strategies to specific document cases. Most modern OCR tools perform well enough to be applied directly to a wide range of common use cases. In these tools, pre-processing and post-processing steps are already integrated, so the user doesn’t have to implement them. However, some use cases involve specific documents that require a more tailored approach. This is where the fine-tuning of OCR models comes into play.

Fine-tuning is a way to further train an existing OCR model, already trained on a vast corpus of data, by exposing it to additional domain-specific data so that it excels in the specific scenarios of interest. Fine-tuning a model may also include adjusting the model’s parameters or defining custom pre-processing and post-processing rules. For instance, a financial institution seeking to digitize intricate financial documents might fine-tune an OCR model to handle specific terminologies and formats, ensuring a higher degree of accuracy. However, training an OCR model can be a long and costly process: you need to gather a lot of documents, annotate them (to give “feedback” to the OCR model during training), and have access to computational power (e.g., GPUs) to train on high volumes of documents.

At Sia Partners, we firmly believe that each use case is different and that a thoughtful, specific approach driven by the sector and the type of documents is therefore necessary to achieve optimal results. As we operate in many different sectors, our value lies in understanding the business needs of our clients. We quickly adapt OCR strategies, including custom pre-processing, post-processing, and fine-tuning, according to the challenges, business needs and the documents to be analyzed.

Our Computer Vision Lab is dedicated to advancing new computer vision technologies and complementary tools. We operate on the cutting edge of OCR research, continually exploring novel approaches to enhance the accuracy and efficiency of text recognition. Our commitment to excellence is reflected in the rigorous technology watch and benchmarking processes we employ, ensuring that our solutions consistently outperform industry standards. Our arsenal includes open-source and commercial solutions. In our quest for excellence, we are also venturing into the field of multimodal models, designed to simultaneously process and integrate information in multiple forms, such as text, images and audio, enabling models to understand the relationships between text and visual elements within an image to improve accuracy and provide a more holistic interpretation of documents. This strategic approach allows us to tackle even the most complex challenges effectively.

Figure 4 — Example of benchmark of OCR frameworks and tools.
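A benchmark of this kind needs a score to compare tools. One widely used option is the character error rate (CER), the edit distance between the OCR output and the ground truth divided by the length of the ground truth. The article does not say which metric is used in Figure 4, so the sketch below is only an illustration; the engine names and sample strings are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning string a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate of an OCR output against the ground truth."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical ground truth and outputs of two hypothetical OCR engines.
ground_truth = ["VALVE-12", "PUMP P-104"]
engine_outputs = {
    "engine_a": ["VA1VE-12", "PUMP P-104"],
    "engine_b": ["VALVE-12", "PUMP P-1O4"],
}


def mean_cer(outputs):
    """Average CER over all benchmark samples."""
    pairs = list(zip(ground_truth, outputs))
    return sum(cer(ref, hyp) for ref, hyp in pairs) / len(pairs)


# Lower is better: rank engines from most to least accurate.
ranking = sorted(engine_outputs, key=lambda name: mean_cer(engine_outputs[name]))
print(ranking)
```

Averaging per-sample CER weights short and long strings equally. Summing edit distances and dividing by the total reference length is an equally reasonable convention, and a real benchmark should state which one it uses.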

Our approach goes beyond ready-made solutions. We understand that each customer context is unique, and as such, we take a highly customizable approach to OCR implementation. Our team combines relevant algorithmic building blocks to craft solutions tailored precisely to the demands of our clients.

With a diverse array of libraries and models at our disposal, we are well-equipped to revolutionize the energy sector through the seamless integration of OCR technology, driving efficiency, safety, and innovation to new heights.

Use Case #1: Pre-decommissioning analysis of nuclear facilities

Decommissioning a nuclear power plant requires adherence to several complex and rigorously executed steps. These actions encompass technical, administrative, regulatory, and even political aspects. An essential step in preparing for decommissioning is the development of the License Termination Plan, which must be submitted to and validated by regulatory authorities. Typically prepared and submitted two years prior to decommissioning, this plan requires the collection of a large volume of documents such as:

License Termination Plan (LTP): This is the primary document outlining the decommissioning strategy and activities. It can be a lengthy and detailed document.
Environmental Impact Statements (EIS): These documents assess the potential environmental impacts of the decommissioning and are often required for regulatory approval.
Safety Analysis Reports: These reports detail the safety measures and controls in place for decommissioning activities.
Radiological Surveys and Reports: These provide data on the radiological conditions at the plant and may be extensive.
Cost Estimates: Detailed cost estimates for decommissioning activities, including workforce, materials, waste disposal, and long-term monitoring.
Public Comments and Responses: Records of public comments and responses during the public involvement process.
Regulatory Correspondence: Correspondence with the nuclear regulatory authority regarding the LTP and approvals.
Historical Plant Records: A wide range of historical records related to the plant’s operation and maintenance.
Technical Specifications: Technical documentation specific to the plant’s design and systems.
Health and Safety Plans: Documents outlining health and safety procedures for decommissioning workers.
Waste Management Plans: Plans for handling and disposing of radioactive and hazardous waste.
Environmental Reports: Reports on environmental monitoring and assessments.
Radiological Release Limits and Standards: Documentation specifying the criteria that must be met for license termination.

The content of these documents is based on pre-existing documentation related to the nuclear facility, and their total page count can easily reach tens of thousands, if not more. The recurring difficulties are linked mainly to historical plant records. As nuclear power plants can operate for 40 years or more, teams sometimes have to deal with old documents, which may be handwritten and poorly preserved. The size and complexity of the facilities entail a significant volume of documents to process, making the task even more laborious.

OCR opens new perspectives in the pre-decommissioning analysis of nuclear facilities, helping engineering teams to characterize the initial state of a nuclear site in view of its decommissioning. OCR, and more broadly, AI, can help with the collection, extraction, and analysis of key information, simplifying the way of dealing with these delicate projects by guaranteeing a more precise, efficient, and secure approach.

This use case can be structured in the following 5 steps.

Figure 5 — Definition of the pre-dismantling scenario in 5 steps.

Phase 1 — Data Extraction

To define the optimal pre-dismantling scenario described in the LTP, the initial state of the nuclear site needs to be characterized beforehand. During this phase, OCR and Computer Vision techniques can be employed to extract data from various document sources, including plans, site history, radiological studies, laboratory notebooks, technical notes, and photos of the installation. For this specific use case, two main categories of documents need to be analyzed: those related to plant safety events (incidents, accidents, and other safety-related events) and those related to the normal life cycle of a plant.

Figure 6 — Example of document of interest: radiation distribution map on a given portion of a nuclear facility (source: CEA document).

The analysis of documents related to plant safety events should yield the following information:

Description of each event recorded in each documentation: The details of each incident, accident or safety-related event must be extracted from the relevant documents, including a complete description of what occurred.
Event qualification: Key terms indicating the nature of the event, such as “leak”, “explosion” or “loss of material”, must be identified and used for classifying each event.
Inventory of radionuclides involved in the recorded events: The types of radionuclides involved in each incident must be recorded, which makes it possible to understand the nature of the radioactive contamination.
Quantity (or activity) of radionuclides: Measurements of the quantity or activity of radionuclides released during each event should be extracted, providing crucial information on the extent of radioactive contamination.
Names or references of equipment/people involved, location: All parties involved in the event and the location of the incident must be identified for effective management and rapid response.
Radioactive Dose Rate Records: Data regarding radioactive dose rates measured during the event should be extracted to assess radiation levels and the impact on worker safety.

The analysis of documents related to the normal life cycle of the plant, on the other hand, should yield the following information:

Inventory of nuclear waste recorded during the life of the plant: The inventory, including the types of waste and their origin, must be extracted.
Quantity (or activity) of radionuclides: Measurements of the quantity or activity of the radionuclides contained in the registered nuclear waste must be extracted for the appropriate management of this waste.
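Once OCR has produced raw text, several of the fields listed above can be pulled out with simple pattern rules. The sketch below is a minimal illustration: the event keywords come from the list above, while the sample sentence, the set of element symbols, and the regular expressions are assumptions that a real project would replace with rules validated by domain experts.

```python
import re

# Keywords taken from the "event qualification" examples in the text.
EVENT_KEYWORDS = ("leak", "explosion", "loss of material")

# Assumption: nuclides are written "Symbol-MassNumber" (e.g. "Cs-137");
# only a few element symbols are listed here for illustration.
NUCLIDE_RE = re.compile(r"\b(Cs|Co|Sr|Pu|Am|U|H)-(\d{1,3})\b")

# Assumption: activities are written as a number followed by a becquerel unit.
ACTIVITY_RE = re.compile(r"(\d+(?:\.\d+)?)\s*([kMGT]?Bq)\b")


def extract_event(text):
    """Extract qualification keywords, radionuclides and activities from OCR text."""
    lowered = text.lower()
    return {
        "qualifications": [kw for kw in EVENT_KEYWORDS if kw in lowered],
        "radionuclides": [f"{el}-{mass}" for el, mass in NUCLIDE_RE.findall(text)],
        "activities": [(float(v), unit) for v, unit in ACTIVITY_RE.findall(text)],
    }


# Invented sample sentence, standing in for the OCR output of a report.
sample = (
    "A leak was detected near pump P-104. Cs-137 activity measured at "
    "3.2 MBq; traces of Co-60 (450 kBq)."
)
print(extract_event(sample))
```

Rules like these are easy to audit, which matters in a regulated context, but they miss anything phrased unexpectedly. In practice they would likely be combined with the NLP models discussed in the next phase.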

Phase 2 — Data Classification

Once the data is available in digital form, it is much easier to process it to improve its quality, discard irrelevant information, and classify it according to criteria of interest. At this stage, algorithms based on specific business rules or AI technologies such as Natural Language Processing (NLP) play a crucial role in automating the classification process. With specifically trained models, textual data is sorted and classified according to a previously defined hierarchy. This approach supports sound data management and makes the information easier to handle in the following phases of the project.
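As a minimal sketch of the business-rule variant, the snippet below assigns each text to one of the two document categories named in Phase 1 by counting keyword occurrences. The category names and keyword lists are hypothetical, and a trained NLP classifier would replace the scoring function in a real system.

```python
# Hypothetical keyword lists for the two document categories of Phase 1.
CATEGORY_KEYWORDS = {
    "safety_event": ("leak", "explosion", "loss of material", "incident", "accident"),
    "normal_life_cycle": ("waste inventory", "maintenance", "routine", "storage"),
}


def classify_document(text, default="unclassified"):
    """Return the category whose keywords occur most often in the text.

    Ties go to the category listed first; texts matching no keyword
    fall back to `default` so that they can be reviewed manually.
    """
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(kw) for kw in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


print(classify_document("Incident report: leak on primary circuit"))
print(classify_document("Routine maintenance of storage drums"))
print(classify_document("Canteen menu for the week"))
```

Keeping an explicit "unclassified" bucket is a deliberate choice here: in a decommissioning file, it is safer to route ambiguous documents to a human than to force them into a category.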

Phase 3 — Data standardization

Extracted data comes from a large variety of sources that differ in format, template, stakeholder, and year of writing, making overall consistency harder to ensure. To tackle this complexity, data standardization becomes necessary: common formats, definitions, and structures are adopted so that data can be shared consistently across the whole project. This phase also allows us to detect possible redundancies and inconsistencies, thus making it possible to consolidate the extracted data. Once again, AI techniques (e.g., Named Entity Recognition or fuzzy matching algorithms) are invaluable allies for automating this phase, helping to build a unified and reliable database that gives a global and precise view of the initial state of the site.
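To illustrate what standardization can look like in practice, the sketch below converts dates to ISO format, radionuclide names to a single spelling, and activities to becquerels, then uses the standardized values to spot redundant records. The record layout, the accepted date formats (assumed day-first), and the sample values are all assumptions made for this example.

```python
import re
from datetime import datetime

UNIT_FACTORS = {"Bq": 1.0, "kBq": 1e3, "MBq": 1e6, "GBq": 1e9, "TBq": 1e12}
# Assumption: ambiguous dates are day-first, as is common in European records.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")
NUCLIDE_RE = re.compile(r"\s*([A-Za-z]{1,2})[\s-]?(\d{1,3})\s*")


def to_iso_date(raw):
    """Parse a date written in one of the known formats into YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_nuclide(raw):
    """Rewrite 'cs 137', 'CS137' or 'Cs-137' as 'Cs-137'."""
    match = NUCLIDE_RE.fullmatch(raw)
    if match is None:
        raise ValueError(f"unrecognized radionuclide: {raw!r}")
    return f"{match.group(1).capitalize()}-{match.group(2)}"


def standardize(record):
    """Map a raw record onto the common format (activity in whole Bq)."""
    return {
        "date": to_iso_date(record["date"]),
        "radionuclide": normalize_nuclide(record["radionuclide"]),
        # Rounded to whole becquerels so equal activities compare equal.
        "activity_bq": round(record["value"] * UNIT_FACTORS[record["unit"]]),
    }


def deduplicate(records):
    """Keep one record per (date, radionuclide, activity) combination."""
    unique = {}
    for rec in map(standardize, records):
        key = (rec["date"], rec["radionuclide"], rec["activity_bq"])
        unique.setdefault(key, rec)
    return list(unique.values())


# Two invented records describing the same measurement in different styles.
raw_records = [
    {"date": "12/03/1987", "radionuclide": "cs 137", "value": 450, "unit": "kBq"},
    {"date": "1987-03-12", "radionuclide": "Cs-137", "value": 0.45, "unit": "MBq"},
]
print(deduplicate(raw_records))
```

Exact matching after normalization only catches redundancies that differ in notation. Records that differ through OCR errors or rounding would need the fuzzy matching mentioned above.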

Phase 4 — Data structuring

Once the data is standardized, its structuring becomes essential for a global and precise understanding of the site’s initial state and the history leading to it. The aim of this phase is to specify the interdependencies between different data points, which can involve chronological, spatial, and cause-effect relationships. In concrete terms, these interconnections can be expressed as relationships in relational databases, which can also be linked to objects stored in storage systems for non-structured data, like pictures or plans. This phase requires strong business knowledge to identify all of those relationships. This linking of data allows for in-depth analysis, facilitating informed decision-making in later phases of the project.
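A minimal sketch of such a structure, using Python's built-in sqlite3 module with an in-memory database, is shown below. The table layout, column names, sample rows, and storage URIs are hypothetical; they only illustrate how chronological, spatial, and cause-effect links, plus references to unstructured files, can be expressed relationally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE equipment (
    id INTEGER PRIMARY KEY,
    reference TEXT UNIQUE,
    location TEXT                                  -- spatial relationship
);
CREATE TABLE event (
    id INTEGER PRIMARY KEY,
    equipment_id INTEGER REFERENCES equipment(id),
    occurred_on TEXT,                              -- ISO date, chronological order
    qualification TEXT,
    caused_by INTEGER REFERENCES event(id)         -- cause-effect relationship
);
CREATE TABLE document (
    id INTEGER PRIMARY KEY,
    event_id INTEGER REFERENCES event(id),
    storage_uri TEXT                               -- pointer to the scan or plan
);
""")

# Invented sample data.
conn.execute("INSERT INTO equipment VALUES (1, 'P-104', 'Building A, room 12')")
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "1987-03-12", "leak", None),
        (2, 1, "1987-03-15", "loss of material", 1),
    ],
)
conn.execute("INSERT INTO document VALUES (1, 1, 'store://scans/report-001.png')")

# History of one piece of equipment, with the source document when available.
rows = conn.execute(
    """
    SELECT e.occurred_on, e.qualification, q.reference, d.storage_uri
    FROM event AS e
    JOIN equipment AS q ON q.id = e.equipment_id
    LEFT JOIN document AS d ON d.event_id = e.id
    WHERE q.reference = ?
    ORDER BY e.occurred_on
    """,
    ("P-104",),
).fetchall()

for row in rows:
    print(row)
```

The images themselves stay in an object store. The database only keeps a pointer to them, which is the split between structured and non-structured storage described above.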

Phase 5 — Pre-dismantling planning

Once the information has been extracted, classified, standardized, and organized in a well-structured network of data, it is finally possible to define a precise and secure pre-dismantling scenario, with a much lower risk of losing information or relying on incorrect assumptions. Authorization requests to access the site for the dismantling operations can then be prepared on the basis of solid and reliable information produced by the preceding phases, which are powered in large part by OCR and other AI technologies. This intelligent use of data makes it possible to minimize risks and optimize dismantling procedures.

OCR and AI technologies represent a major advance in the pre-decommissioning process of nuclear facilities. In some cases, tasks that might have taken weeks or months of manual effort could be completed in a fraction of the time with such technologies. By efficiently collecting, extracting, and analyzing key information, they enable a more precise, efficient, and secure approach to decommissioning projects. By automating tedious tasks, OCR and AI free up time and resources while helping to preserve the quality of the extracted data. With their potential for optimizing procedures and minimizing risks, these technologies open new perspectives for the field of nuclear dismantling.

Use Case #2: Field interventions on the electrical network

Players in the energy sector are constantly looking for ways to improve the efficiency of their operations while ensuring the safety of their employees. OCR offers tremendous potential for streamlining field operations, especially when it comes to technicians working on the power grid, while increasing the safety of those workers.

Whenever technicians carry out an intervention on the electrical network, it is essential that they have accurate and up-to-date information regarding the components they are handling, because intervening on an electrical network can be dangerous. A technician’s error can endanger their life, as well as the stability and reliability of the network. For this reason, companies provide technicians with a work sheet containing details of equipment, component IDs, maintenance procedures, and more.

However, reality on the ground can be more complex than expected, and the work sheet itself does not prevent human errors. Given the complexity of the systems technicians deal with, it is even possible to identify and start working on the wrong piece of equipment. In this context, AI can be used to simplify equipment identification and increase employee safety. For example, field technicians can be equipped with a mobile application to use before starting an intervention: the technician takes a picture of the equipment in front of them, and OCR algorithms automatically extract critical information, such as the equipment identifier label, to verify that the right equipment has been identified, thus reducing the risk of human errors and contributing to the overall safety of field operations.

Environmental conditions can degrade the readability of the characters on identifier labels, but matching algorithms make it possible to find the correct identifier among the components listed in the reference databases. This kind of application can also guide technicians further during their interventions: image segmentation could be applied to the picture to precisely identify relevant components and sub-components. In addition, pictures could be taken at the end of each intervention and automatically analyzed to extract and store information, ensuring better traceability.
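The matching step can be sketched as follows. The identifier format, the reference list, and the table of typical OCR confusions (such as the letter O read instead of the digit 0) are assumptions made for illustration. The similarity measure is the one provided by Python's difflib module, which is one possible choice among several.

```python
import difflib

# Assumption: characters that OCR commonly confuses are folded together
# before comparison, on both the OCR output and the reference identifiers.
CONFUSIONS = str.maketrans({"O": "0", "I": "1", "L": "1", "S": "5", "B": "8"})


def canonical(identifier):
    """Upper-case, drop spaces and fold look-alike characters."""
    return identifier.upper().replace(" ", "").translate(CONFUSIONS)


def match_identifier(ocr_text, reference_ids, cutoff=0.8):
    """Return the reference identifier closest to the OCR reading, or None.

    Note: two references with the same canonical form would collide here;
    a real system should detect that case and ask for confirmation.
    """
    index = {canonical(ref): ref for ref in reference_ids}
    hits = difflib.get_close_matches(
        canonical(ocr_text), list(index), n=1, cutoff=cutoff
    )
    return index[hits[0]] if hits else None


# Hypothetical reference database of equipment identifiers.
REFERENCE_IDS = ["TR-4510-B", "TR-4570-B", "SW-0032-A"]

print(match_identifier("TR-45I0-8", REFERENCE_IDS))  # degraded label reading
print(match_identifier("XX-9999", REFERENCE_IDS))    # nothing close enough
```

Because a wrong match is a safety issue in this setting, the cutoff should be strict, and a reading that is close to several references should be escalated to the technician rather than resolved automatically.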

In summary, involving AI and OCR techniques in the maintenance of the electrical network could bring several advantages:

Reduced Human Errors: By eliminating manual data entry and supporting technicians, OCR significantly reduces the risk of human error. A simple image can provide useful information for technicians.
Time saving: Time is a precious asset during field operations. The use of OCR allows technicians to save time by avoiding searching for the components on which they must intervene. This time saving allows technicians to focus on their main tasks, speeding up interventions and improving their overall efficiency.
Better Traceability: The traceability of components is crucial to guarantee the quality and safety of the electrical network. Data collected from the job sheet and photos taken by technicians can be stored centrally. This allows for accurate traceability of components and their maintenance history. In the event of problems or incidents, it is easier to identify the source of the problem and take corrective action.
Maintenance Optimization: Access to accurate and up-to-date data is essential to effectively plan and execute preventive maintenance. OCR makes it possible to maintain an up-to-date database of electrical network equipment and components during interventions. Companies can plan maintenance operations based on the actual condition of equipment, reducing unplanned shutdowns, and minimizing emergency repair costs.

Conclusion

At Sia Partners, our expertise extends beyond the realms of data and delves deep into the core of clients’ business needs. We understand that harnessing the power of OCR is not just about recognizing characters on an image: it’s about leveraging data and optimizing business processes. Our approach combines the mastery of data through the careful choice and integration of the right AI tools, tailored to the specific requirements of each use case. This, in turn, empowers us to deploy OCR solutions that not only meet clients’ immediate needs but also set the stage for transformation at an industrial scale, where efficiency and productivity flourish.

Yet, our added value extends far beyond data. We pride ourselves on our comprehensive understanding of business issues and the ability to rationalize processes. We recognize that adopting OCR technology is not just about the technical implementation but about ushering in a broader change within organizations. Our teams excel in change management, ensuring a smooth transition into the AI ecosystem. We work closely with our clients to understand the intricacies of their operations, providing solutions that address the specific challenges unique to their industry.

In essence, at Sia Partners, we are not just technology providers; we are partners in clients’ journey towards unlocking the full potential of OCR and, more broadly, AI, both in data and in business. We are dedicated to enhancing clients’ operational efficiency, reducing costs, and ultimately, driving their organization’s success.