Machine Learning with Firebase ML Kit

Machine Learning (ML) has become a critical topic in application development. In a nutshell, ML means that you feed data that “trains” an algorithm and use it to generate a model. This trained model can then be used to solve problems that would be virtually impossible to tackle with traditional programming. To make this process more manageable, Firebase offers a service called ML Kit, which you could describe as “pre-built ML.” Among other features, it provides text recognition, image labeling, face detection, and barcode scanning. For functionalities that ML Kit does not already provide, you can create your own custom model with TensorFlow Lite. Most of the services outlined in this tutorial can run either in the cloud or on your device.

In this tutorial, you will start by taking a picture with your device; then, you will use ML Kit to recognize text, barcodes, images, and faces, and to identify a language. At the end of this tutorial, you will also be introduced to TensorFlow Lite, which allows you to build and run your own ML models.

In particular, we will cover the following topics:

  • Using the device camera
  • Recognizing text from an image
  • Reading a barcode
  • Image labeling
  • Building a face detector and detecting facial gestures
  • Identifying a language
  • Using TensorFlow Lite

By the end of this tutorial, you will be able to leverage several Firebase ML services and add machine learning features to your apps without writing any server-side code.

Using the device camera

In this recipe, you will use the camera plugin to create a canvas for ML Kit’s vision models. The camera plugin is not exclusive to ML, but it is a prerequisite for the ML vision features that you will use in the following recipes in this tutorial.

By the end of this recipe, you will be able to use the device cameras (front and rear) to take pictures and use them in your apps.

Getting ready

For this recipe, you should create a new project and set it up with Firebase, as described in the Configuring a Firebase app recipe.

How to do it…

In this recipe, you will add the camera functionality to your app. The users will be able to take a picture with the front or rear camera of their device. Follow these steps:

  1. In the dependencies section of the project’s pubspec.yaml file, add the camera and path_provider packages:
     camera: ^0.8.1
     path_provider: ^2.0.1
  2. For Android, set the minimum Android SDK version to 21 or higher in your android/app/build.gradle file:
     minSdkVersion 21
  3. For iOS, add the following keys to the ios/Runner/Info.plist file:
     <key>NSCameraUsageDescription</key>
     <string>Enable MLApp to access your camera to capture your photo</string>
  4. In the lib folder of your project, add a new file and call it camera.dart.
  5. At the top of the camera.dart file, import material.dart and the camera package:
     import 'package:flutter/material.dart';
     import 'package:camera/camera.dart';
  6. Create a new stateful widget, calling it CameraScreen:
     class CameraScreen extends StatefulWidget {
       @override
       _CameraScreenState createState() => _CameraScreenState();
     }

     class _CameraScreenState extends State<CameraScreen> {
       @override
       Widget build(BuildContext context) {
         return Container();
       }
     }
  7. At the top of the _CameraScreenState class, declare the following variables:
     List<CameraDescription> cameras;
     List<Widget> cameraButtons;
     CameraDescription activeCamera;
     CameraController cameraController;
     CameraPreview preview;
  8. At the bottom of the _CameraScreenState class, create a new asynchronous method, called listCameras, that returns a list of widgets:
     Future<List<Widget>> listCameras() async {}
  9. In the listCameras method, call the availableCameras method and, based on the result of the call, return ElevatedButton widgets with the name of each camera, as shown:
     List<Widget> buttons = [];
     cameras = await availableCameras();
     if (cameras == null) return null;
     if (activeCamera == null) activeCamera = cameras.first;
     if (cameras.length > 0) {
       for (CameraDescription camera in cameras) {
         buttons.add(ElevatedButton(
           onPressed: () {
             setState(() {
               activeCamera = camera;
               setCameraController();
             });
           },
           child: Row(
             children: [
               Icon(Icons.camera_alt),
               Text(camera == null ? '' : camera.name)
             ],
           )));
       }
       return buttons;
     } else {
       return [];
     }
  10. In the _CameraScreenState class, create a new asynchronous method, called setCameraController, that, based on the value of activeCamera, sets the preview variable, as shown:
     Future setCameraController() async {
       if (activeCamera == null) return;
       cameraController = CameraController(
         activeCamera,
         ResolutionPreset.high,
       );
       await cameraController.initialize();
       setState(() {
         preview = CameraPreview(
           cameraController,
         );
       });
     }
  11. Under the setCameraController method, add another asynchronous method and call it takePicture. This returns an XFile, which is the result of the call to the takePicture method of cameraController, as shown:
     Future<XFile> takePicture() async {
       if (!cameraController.value.isInitialized) {
         return null;
       }
       if (cameraController.value.isTakingPicture) {
         return null;
       }
       try {
         await cameraController.setFlashMode(FlashMode.off);
         XFile picture = await cameraController.takePicture();
         return picture;
       } catch (exception) {
         print(exception.toString());
         return null;
       }
     }
  12. Override the initState method. Inside it, set cameraButtons and call the setCameraController method, as shown here:
     @override
     void initState() {
       listCameras().then((result) {
         setState(() {
           cameraButtons = result;
           setCameraController();
         });
       });
       super.initState();
     }
  13. Override the dispose method and, inside it, dispose of cameraController:
     @override
     void dispose() {
       if (cameraController != null) {
         cameraController.dispose();
       }
       super.dispose();
     }
  14. In the build method, return a Scaffold with an AppBar whose title is Camera View, and a body containing a Container widget:
     return Scaffold(
       appBar: AppBar(
         title: Text('Camera View'),
       ),
       body: Container());
  15. In the Container, set a padding of EdgeInsets.all(24) and a child of Column, as shown:
     Container(
       padding: EdgeInsets.all(24),
       child: Column(
         mainAxisAlignment: MainAxisAlignment.spaceAround,
         children: [])
     ...
  16. In the children parameter of Column, add a Row with cameraButtons, a Container with the camera preview, and another Row with a button that takes the picture using the camera:
     Row(
       mainAxisAlignment: MainAxisAlignment.spaceAround,
       children: cameraButtons ??
           [Container(child: Text('No cameras available'))],
     ),
     Container(
       height: MediaQuery.of(context).size.height / 2,
       child: preview ?? Container()),
     Row(
       mainAxisAlignment: MainAxisAlignment.spaceEvenly,
       children: [
         ElevatedButton(
           child: Text('Take Picture'),
           onPressed: () {
             if (cameraController != null) {
               takePicture().then((dynamic picture) {
                 Navigator.push(
                     context,
                     MaterialPageRoute(
                         builder: (context) => PictureScreen(picture)));
               });
             }
           },
         )
       ],
     )
  17. In the lib folder of your project, create a new file, called picture.dart.
  18. In the picture.dart file, import the following packages:
     import 'package:camera/camera.dart';
     import 'package:flutter/material.dart';
     import 'dart:io';
  19. In the picture.dart file, create a new stateful widget called PictureScreen:
     class PictureScreen extends StatefulWidget {
       @override
       _PictureScreenState createState() => _PictureScreenState();
     }

     class _PictureScreenState extends State<PictureScreen> {
       @override
       Widget build(BuildContext context) {
         return Container();
       }
     }
  20. At the top of the PictureScreen class, add a final XFile called picture and create a constructor method that sets its value:
     final XFile picture;
     PictureScreen(this.picture);
  21. In the build method of the _PictureScreenState class, retrieve the device’s height, then return a Scaffold containing a Column widget that shows the picture that was passed to the screen. Under the picture, also place a button that will later send the file to the relevant ML service, as shown here:
     double deviceHeight = MediaQuery.of(context).size.height;
     return Scaffold(
       appBar: AppBar(
         title: Text('Picture'),
       ),
       body: Column(
         mainAxisAlignment: MainAxisAlignment.spaceEvenly,
         children: [
           Text(widget.picture.path),
           Container(
             height: deviceHeight / 1.5,
             child: Image.file(File(widget.picture.path))),
           Row(
             children: [
               ElevatedButton(
                 child: Text('Text Recognition'),
                 onPressed: () {},
               )
             ],
           )
         ],
       ),
     );
  22. Get back to the camera.dart file. In the onPressed function of the Take Picture button, navigate to the PictureScreen widget, passing the picture that was taken, as shown. If picture.dart is not imported automatically by your IDE, also import it at the top of camera.dart:
     ElevatedButton(
       child: Text('Take Picture'),
       onPressed: () {
         if (cameraController != null) {
           takePicture().then((dynamic picture) {
             Navigator.push(
                 context,
                 MaterialPageRoute(
                     builder: (context) => PictureScreen(picture)));
           });
         }
       },
     )
  23. Back in the main.dart file, in the MyApp class, call the CameraScreen widget and set the title and theme of MaterialApp as shown. Also, remove any other code under MyApp:
     class MyApp extends StatelessWidget {
       @override
       Widget build(BuildContext context) {
         return MaterialApp(
           title: 'Firebase Machine Learning',
           theme: ThemeData(
             primarySwatch: Colors.deepOrange,
           ),
           home: CameraScreen(),
         );
       }
     }
  24. Run the app. Choose one of the cameras on your device, then press the Take Picture button. You should see the picture you have taken, together with the path of the file that was saved on your device, as shown in the following screenshot:

How it works…

Being able to use the camera and add pictures to your app is useful not only for ML but also for several other features you might want to offer. You can leverage the camera plugin to get a list of the available cameras on the device and to take photos or videos.

With the camera plugin, you get access to two useful objects:  

  • CameraController connects to a device’s camera and you use it to take pictures or videos.  
  • CameraDescription contains the properties of a camera device, including its name and orientation. 

Most devices have two cameras, one on the front (for selfies) and one on the back, but some devices may have only one, and others may have more than two when an external camera is connected. That’s why, in our code, we built a dynamic List of CameraDescription objects so that the user can choose the camera they want to use, with the following instruction:

cameras = await availableCameras(); 

The availableCameras method retrieves all the cameras available on the device in use, returning a Future of List<CameraDescription>.
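
As a minimal sketch (assuming the camera ^0.8.x API used in this recipe, with a hypothetical findRearCamera helper), you could also pick a specific camera by its lensDirection instead of letting the user choose:

     import 'package:camera/camera.dart';

     // A minimal sketch, assuming the camera ^0.8.x API used in this recipe:
     // pick the rear camera if present, otherwise fall back to the first one.
     Future<CameraDescription> findRearCamera() async {
       final List<CameraDescription> cameras = await availableCameras();
       return cameras.firstWhere(
         (CameraDescription camera) =>
             camera.lensDirection == CameraLensDirection.back,
         orElse: () => cameras.first,
       );
     }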

In order to use the selected camera, we called the CameraController constructor, passing the active camera, with the following instruction:

cameraController = CameraController(activeCamera, ResolutionPreset.high);

Note that you can also choose the resolution of the picture with the ResolutionPreset enumeration; in this case, ResolutionPreset.high provides a good resolution (a high resolution is recommended when you use ML algorithms).

The asynchronous takePicture method of CameraController actually takes the picture; this saves the image at a default path, which we later show on the second screen of the app:

XFile picture = await cameraController.takePicture(); 

The takePicture method returns an XFile, which is a cross-platform file abstraction.  

Another important step is overriding the dispose method of the _CameraScreenState class, so that the dispose method of CameraController is called when the widget is disposed of.

The second screen that you have built in this recipe is the PictureScreen widget. This displays the picture that was taken and shows its path to the user. Please note the following instruction:

Image.file(File(widget.picture.path)) 

In order to show the picture in the app with Image.file, you need to create a File from the XFile path, as you cannot pass the XFile directly.
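
As a minimal sketch (assuming the XFile API exposed by the camera plugin, with hypothetical toIoFile and toBytes helper names), these are two common ways to consume the captured file:

     import 'dart:io';
     import 'dart:typed_data';
     import 'package:camera/camera.dart';

     // A minimal sketch, assuming the XFile API exposed by the camera plugin.
     // Wrap the captured file in a dart:io File for widgets such as Image.file.
     File toIoFile(XFile picture) => File(picture.path);

     // Alternatively, read the raw bytes (useful with Image.memory).
     Future<Uint8List> toBytes(XFile picture) => picture.readAsBytes();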

Now your app can take pictures from the camera(s) in your device. This is a prerequisite for several of the remaining recipes of this tutorial.  

See also

While currently there is no way to apply real-time filters with the official camera plugin (for the issue, see https://github.com/flutter/flutter/issues/49531), there are several workarounds that achieve similar effects in Flutter. For an example that uses opacity, see https://stackoverflow.com/questions/50347942/flutter-camera-overlay.
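
A minimal sketch of that kind of workaround (assuming an already initialized CameraController and a hypothetical buildTintedPreview helper) is to stack a semi-transparent layer on top of the preview:

     import 'package:flutter/material.dart';
     import 'package:camera/camera.dart';

     // A minimal sketch, assuming cameraController has already been initialized:
     // a CameraPreview with a semi-transparent color layered on top of it.
     Widget buildTintedPreview(CameraController cameraController) {
       return Stack(
         fit: StackFit.expand,
         children: [
           CameraPreview(cameraController),
           // The overlay only changes what the user sees, not the saved picture.
           Container(color: Colors.deepOrange.withOpacity(0.3)),
         ],
       );
     }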

Recognizing text from an image

We’ll start with ML by incorporating ML Kit’s text recognizer. You will create a feature where you take a picture, and if there is some recognizable text in it, ML Kit will turn it into one or more strings.  

Getting ready

For this recipe, you should have completed the previous one: Using the device camera.

How to do it…

In this recipe, after taking a picture, you will add a text recognition feature. Follow these steps:

  1. Add the firebase_ml_vision package to the dependencies section of your pubspec.yaml file:
     firebase_ml_vision: ^0.10.0
  2. Create a new file in the lib folder of your project and call it ml.dart.
  3. Inside the new file, import the dart:io and firebase_ml_vision packages:
     import 'dart:io';
     import 'package:firebase_ml_vision/firebase_ml_vision.dart';
  4. Create a new class, calling it MLHelper:
     class MLHelper {}
  5. In the MLHelper class, create a new async method, called textFromImage, that takes an image file and returns a Future<String>:
     Future<String> textFromImage(File image) async {}
  6. In the textFromImage method, process the image with the ML Kit TextRecognizer and return the retrieved text, as shown:
     final FirebaseVision vision = FirebaseVision.instance;
     final FirebaseVisionImage visionImage = FirebaseVisionImage.fromFile(image);
     TextRecognizer recognizer = vision.textRecognizer();
     final results = await recognizer.processImage(visionImage);
     return results.text;
  7. In the lib folder of your project, create a new file and call it result.dart.
  8. At the top of the result.dart file, import the material.dart package:
     import 'package:flutter/material.dart';
  9. Create a new stateful widget and call it ResultScreen:
     class ResultScreen extends StatefulWidget {
       @override
       _ResultScreenState createState() => _ResultScreenState();
     }

     class _ResultScreenState extends State<ResultScreen> {
       @override
       Widget build(BuildContext context) {
         return Container();
       }
     }
  10. At the top of the ResultScreen class, declare a final String, called result, and set it in the default constructor:
     final String result;
     ResultScreen(this.result);
  11. In the build method of the _ResultScreenState class, return a Scaffold and, in its body, add a SelectableText, as shown:
     return Scaffold(
       appBar: AppBar(
         title: Text('Result'),
       ),
       body: Container(
         child: Padding(
           padding: EdgeInsets.all(24),
           child: SelectableText(
             widget.result,
             showCursor: true,
             cursorColor: Theme.of(context).accentColor,
             cursorWidth: 5,
             toolbarOptions: ToolbarOptions(copy: true, selectAll: true),
             scrollPhysics: ClampingScrollPhysics(),
             onTap: () {},
           )),
       ),
     );
  12. In the picture.dart file, in the onPressed function of the Text Recognition button, add the following code (importing ml.dart and result.dart if your IDE does not add the imports automatically):
     onPressed: () {
       MLHelper helper = MLHelper();
       helper.textFromImage(File(widget.picture.path)).then((result) {
         Navigator.push(
             context,
             MaterialPageRoute(
                 builder: (context) => ResultScreen(result)));
       });
     },
  13. Run the app, select a camera on your device, and take a picture of some printed text. Then, press the Text Recognition button. You should see the text taken from your picture, as shown in the following screenshot:

How it works…

When using ML Kit, the process required to get results is usually the following: 

  1. You get an image.  
  2. You send it to the API to get some information about the image.  
  3. The ML Kit API returns data to the app, which can then use it as necessary. 

The first step is getting an instance of the Firebase ML Vision API. In this recipe, we got it with the following instruction:  

final FirebaseVision vision = FirebaseVision.instance; 

The next step is creating a FirebaseVisionImage, which is the image object used for the API detector. In our example, you created it with the following instruction: 

final FirebaseVisionImage visionImage = FirebaseVisionImage.fromFile(image); 

Once the FirebaseVision instance and the FirebaseVisionImage are available, you call a detector; in this case, you called a TextRecognizer with the following instruction:

TextRecognizer recognizer = vision.textRecognizer(); 

To get the text from the image, you need to call the processImage method on TextRecognizer. This asynchronous method returns a VisionText object, which carries several pieces of information, including the text property, containing all the text recognized in the image. You got the text with the following instructions:

final results = await recognizer.processImage(visionImage); 
return results.text;
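
Should you need more than the plain text, here is a hedged sketch of how you could walk the recognized structure (assuming the VisionText, TextBlock, and TextLine classes of firebase_ml_vision ^0.10.0, with a hypothetical printTextStructure helper):

     import 'package:firebase_ml_vision/firebase_ml_vision.dart';

     // A minimal sketch, assuming the firebase_ml_vision ^0.10.0 API:
     // print every recognized block and line together with its bounding box.
     void printTextStructure(VisionText visionText) {
       for (TextBlock block in visionText.blocks) {
         print('Block: ${block.text} at ${block.boundingBox}');
         for (TextLine line in block.lines) {
           print('  Line: ${line.text}');
         }
       }
     }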

You then showed the text on another screen but, instead of just returning a Text widget, you used SelectableText. This widget allows users to select some text and copy it to other applications; you can choose which options should be shown to the user with the ToolbarOptions parameter.
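
The same three-step flow applies to the other ML Kit detectors mentioned at the beginning of this tutorial. As a hedged sketch only (assuming the BarcodeDetector API of firebase_ml_vision ^0.10.0 and a hypothetical barcodeFromImage helper, not the code of the later barcode recipe), a barcode version of the MLHelper method could look like this:

     import 'dart:io';
     import 'package:firebase_ml_vision/firebase_ml_vision.dart';

     // A minimal sketch, assuming the firebase_ml_vision ^0.10.0 BarcodeDetector:
     // the same image -> detector -> result flow used for text recognition.
     Future<String> barcodeFromImage(File image) async {
       final FirebaseVision vision = FirebaseVision.instance;
       final FirebaseVisionImage visionImage = FirebaseVisionImage.fromFile(image);
       final BarcodeDetector detector = vision.barcodeDetector();
       final List<Barcode> barcodes = await detector.detectInImage(visionImage);
       // Join the values of all detected barcodes into a single string.
       return barcodes.map((Barcode barcode) => barcode.displayValue).join('\n');
     }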
