Artificial Intelligence (AI) has the potential to transform
manufacturing by improving shop floor processes such as production,
maintenance and quality. However, industrial datasets are notoriously
difficult to extract in a real-time, streaming fashion, which negates
the potential benefits of AI. A prime example is specialized industrial
controllers operated by custom software, which complicates
connecting them to an Information Technology (IT)
based data acquisition network. Security concerns may also limit
direct physical access to these controllers for data acquisition.
To connect the Operational Technology (OT) data stored in these
controllers to an AI application in a secure, reliable and available
way, we propose a novel Industrial IoT (IIoT) solution in this paper.
In this solution, we demonstrate how video cameras can be installed
on a factory shop floor to continuously capture images of the controller
Human-Machine Interfaces (HMIs). We propose image pre-processing to
segment each HMI into regions of streaming data and regions of fixed
meta-data. We then evaluate how well multiple Optical Character
Recognition (OCR) technologies, such as Tesseract and Google Vision,
recognize the streaming data, testing them on typical factory HMIs under
realistic lighting conditions. Finally, we use the meta-data to match the OCR
output with the temporal, domain-dependent context of the data to
improve the accuracy of the output. Our IIoT solution enables reliable
and efficient data extraction, which will improve the performance of
subsequent AI applications.
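
As an illustration of the final step, the following minimal sketch shows one way fixed meta-data and temporal context could be used to clean up raw OCR strings. It is not the implementation evaluated in this paper: the `FieldMeta` record, the `correct_reading` function, the look-alike character table, and the spindle-speed field with its limits are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical look-alike substitutions for numeric fields (an assumption,
# not a table taken from the paper).
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"})


@dataclass
class FieldMeta:
    """Fixed meta-data for one streaming region of the HMI (hypothetical)."""
    name: str
    minimum: float
    maximum: float
    max_step: float  # largest plausible change between consecutive frames


def correct_reading(raw: str, meta: FieldMeta, previous: Optional[float]) -> Optional[float]:
    """Return a cleaned numeric value, or None if the reading is implausible."""
    text = raw.strip().translate(CONFUSIONS).replace(",", ".")
    try:
        value = float(text)
    except ValueError:
        return None  # still not a number after character substitution
    if not meta.minimum <= value <= meta.maximum:
        return None  # outside the range declared by the meta-data
    if previous is not None and abs(value - previous) > meta.max_step:
        return None  # jump too large for the temporal context
    return value


# Example field and raw OCR strings, both made up for illustration.
spindle = FieldMeta(name="spindle_speed_rpm", minimum=0.0, maximum=6000.0, max_step=500.0)
readings = ["12O0", "1210", "7210", "l25S"]

accepted = []
last = None
for raw in readings:
    value = correct_reading(raw, spindle, last)
    if value is not None:
        last = value  # only accepted values update the temporal context
    accepted.append(value)

print(accepted)  # [1200.0, 1210.0, None, 1255.0]
```

In this sketch a rejected frame is simply dropped and the last accepted value is kept as context; a deployed system might instead re-run OCR on the region or flag the frame for review.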