What exactly is Firebase?

Firebase is a platform for building mobile and web applications.

Introduction to Firebase

Firebase is a comprehensive platform, developed by Google, for building web and mobile applications. It offers a range of services that aid developers in various aspects of application development, including authentication, real-time database, cloud functions, hosting, and more. One of the key features that makes Firebase popular is its serverless architecture, which allows developers to focus on building features without managing the infrastructure.

Serverless Architecture in Firebase

What is Serverless Architecture?

Serverless architecture is a cloud computing execution model where the cloud provider dynamically manages the infrastructure. In the context of Firebase, serverless does not mean there are no servers; instead, it means that developers don’t have to worry about server management, scaling, or maintenance. Firebase abstracts away much of the traditional server-side infrastructure, allowing you to focus on writing code and deploying features.

Key Components of Firebase’s Serverless Architecture

  1. Cloud Firestore:

    • Firebase provides a NoSQL document database called Cloud Firestore. It’s designed to store and sync data in real-time between connected devices.
    • Good practice: Structure your data model efficiently to make the most out of Firestore’s capabilities. Use collections and documents wisely.
  2. Authentication:

    • Firebase Authentication simplifies the process of managing user authentication. It supports various authentication providers like email/password, Google, Facebook, etc.
    • Good practice: Implement secure authentication practices, such as using HTTPS and validating user inputs on the client and server sides.
  3. Cloud Functions:

    • Cloud Functions for Firebase is a serverless framework that allows you to run backend code in response to events triggered by Firebase features and HTTPS requests.
    • Good practice: Keep functions small, focused, and testable. Leverage the event-driven model for scalability.
  4. Hosting:

    • Firebase Hosting enables you to deploy web apps quickly and securely. It automatically provisions and configures SSL certificates for HTTPS.
    • Good practice: Optimize your web app for performance. Leverage caching and minification to reduce load times.
  5. Realtime Database:

    • Firebase Realtime Database is an older alternative to Cloud Firestore. It’s a JSON-like database that supports real-time data synchronization.
    • Good practice: Use Realtime Database for specific use cases where its features align with your application requirements.

What did we learn?

Firebase Project

All Firebase backend services are housed by a Firebase Project. When you visit the console you select a project and the services related to that project show their associated data. You can own multiple projects for multiple apps.

Firebase App

To connect out to a Firebase Project we need to create or use the configuration we call a Firebase App. A project can have multiple apps depending on the types of platforms it needs to support. If you’re just building a web app, then you only need one app. If you’re going to build a cross-platform app, then you’ll need an app for each platform.

JavaScript SDK

Firebase comes with a JavaScript SDK that handles a lot for you. You don’t have to write any networking or caching code. The JavaScript SDK uses the configuration of a Firebase App to know what backend service to connect to.
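As a rough illustration, the configuration of a web Firebase App is just a plain object. Every value below is a placeholder rather than a real credential, and missingConfigKeys is a hypothetical helper written only for this sketch (it is not part of the SDK). The keys it checks are this sketch's own choice. In a real app you would pass the object from your Firebase console to initializeApp().

```javascript
// Placeholder values only; copy the real object from your Firebase console.
const firebaseConfig = {
  apiKey: "YOUR_API_KEY",
  authDomain: "your-project-id.firebaseapp.com",
  projectId: "your-project-id",
  appId: "YOUR_APP_ID",
}

// Hypothetical helper (not part of the SDK): list the keys this sketch
// chooses to check that are absent or empty.
function missingConfigKeys(config, keysToCheck = ["apiKey", "projectId", "appId"]) {
  return keysToCheck.filter((key) => !config[key])
}

console.log(missingConfigKeys(firebaseConfig).length) // 0
console.log(missingConfigKeys({ apiKey: "x" }).join(", ")) // projectId, appId
```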

Firebase CLI

Firebase also comes with a CLI that can perform many actions. In this example we used it to deploy to Firebase Hosting. Stay tuned though, we’re about to use the Firebase CLI for one of the most useful bits of Firebase development, the Emulator Suite.

Cloud Firestore

Realtime streams, advanced querying, and when to denormalize.

Up until this point we’ve been covering Firebase from a general point of view. From here on out, we’ll be getting really specific and focusing on individual products and their features.

We saw that with security it’s really important to have authentication figured out. But! We’re going to start with the database because, well, that’s the fun part.

What is Firestore?

Firestore is a NoSQL document database with realtime and offline capabilities. Firestore is designed to reduce, if not eliminate, slow queries. In fact, if you manage to write a slow query with Firestore, it’s likely because you’re downloading too many documents at once. With Firestore you can query for data and receive updates within 500ms every time the data changes in the database.

Firestore is extremely powerful, but there’s a bit of a perspective shift if you’re coming from a SQL background.

SQL and NoSQL

Many developers come to NoSQL with at least some SQL experience. They are used to a world of schemas, normalization, joins, and rich querying features. NoSQL tends to be a bit of a jarring experience at first because it has different priorities. SQL prioritizes having data in a uniform and distinct model.

This data model is built through tables. You can think of tables in two dimensions: rows and columns. Rows are a single record or object within the table. Columns are the properties on that object. Columns provide a rigid but high level of integrity to your data structure.

This table has a uniform schema that all records must follow. This gives us a high amount of integrity within the data model at the cost of making variants of this record. You can’t add a single column onto a single row. If a new column is created, every row gets that column, even if the value is just null. Let’s say we wanted to add another column to this tbl_expenses table: approval.

Expenses can be personal, so this column may not make sense for every row. For those cases how do we fill the column? Do we make it false? Well, that might communicate that the manager didn’t approve a personal charge, and it could also accidentally end up in a query looking for all unapproved expenses. How about making it null? This value is supposed to be a boolean, but if we make it nullable, it can have three states. If you can have three states, is it really a boolean?

Usually in those cases you would have to create another table such as: tbl_business_expenses. Now, to get a list of personal and business expenses back you’d have to write a query.

-- Union expenses with business expenses
SELECT id, cost, date, uid
  FROM tbl_expenses
  WHERE uid = 'david_123'
UNION
SELECT id, cost, date, uid
  FROM tbl_business_expenses
  WHERE uid = 'david_123'

The advantage here is that the data integrity is high; however, it comes at the cost of read time. Anytime you have to write a JOIN or some other clause you are tacking on the time to complete the query. In simple queries this isn’t a big deal, but as data models grow more complicated the queries do as well. If you aren’t careful, you can end up with a 7-way INNER JOIN just to get a list of expenses.

A query like that can be rather slow. If this kind of query is one of the most important aspects of your site, it needs to be fast, even as more and more records are added to the database. As the database needs to scale, you’ll need to put it on beefier and beefier machines. This is known as scaling vertically.

Reads over Writes

NoSQL databases don’t care as much about uniformity and definitely not as much about having a distinct model. What Firestore prioritizes is fast reads. In many applications it’s common to have more reads than writes. Take a second and think about some of the most common apps and sites in your life. Now think about how much more you consume the content from them versus write updates to them. Even if you are an avid poster, it’s still likely that you scroll through your feed more.

In the Firestore world, we’re going to be shifting away from data normalization and strong distinctive models. What we’ll get in return are blazing fast queries with realtime streaming. In addition, as your database scales up to support more data it can be distributed across several machines to handle the work behind the scenes. This concept is known as scaling horizontally and it’s how Firestore scales up automatically for you.

With all that out of the way, let’s see how you store and model data in Firestore.

Tables -> Collections

If SQL databases use tables to structure data, what do NoSQL databases use? Well, in Firestore’s case data is stored in a hierarchical, folder-like structure using collections and documents.

Diagram of two documents in an expenses collection with different schemas

Collections are really just a concept for documents all stored at a similar path name. All data within Firestore is stored in documents.

Rows -> Documents

Firestore consists of documents, which are like objects. They’re not just basic JSON objects, they can store complex types of data like binary data, references to other documents, timestamps, geo-coordinates, and so on and so forth.

Now in SQL all rows had to have the same columns. But that’s not the case in NoSQL: every document can have whatever structure it wants.

Each document can be totally different from the other if you want. That’s usually not the case in practice, but you have total flexibility at the database level. You can lock down the schema with security rules, but we’ll get into that later.
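To make that flexibility concrete, here is a small sketch that models a collection as a plain JavaScript Map. It does not use the Firebase SDK, and the document IDs, field names, and the describeApproval helper are all invented for illustration. It shows two expense documents with different shapes living side by side, which is the situation the SQL version needed a second table for.

```javascript
// A collection modeled as a Map of document ID -> document data.
const expenses = new Map([
  ["expense_1", { cost: 42.5, date: "2022-05-01", uid: "david_123" }],
  // Only the business expense carries an `approved` field.
  ["expense_2", { cost: 120, date: "2022-05-02", uid: "david_123", approved: true }],
])

// Reading code has to tolerate fields that may be absent.
function describeApproval(expense) {
  if (!("approved" in expense)) return "personal (no approval needed)"
  return expense.approved ? "approved" : "not approved"
}

for (const [id, data] of expenses) {
  console.log(id, "->", describeApproval(data))
}
```

The trade-off is visible in describeApproval: the database no longer guarantees the field exists, so the code that reads the data has to handle its absence.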

Retrieving data

With SQL you think about retrieving data in terms of queries. While that is still true here, you should primarily think about data in terms of locations with path names. In the JavaScript SDK we call this a reference.

References

Documents and collections are identified by a unique path. To refer to this location you create a reference.

import { collection, doc } from "firebase/firestore"
 
// `firestore` is the instance returned by getFirestore()
const usersCollectionReference = collection(firestore, "users")
// or for short
const usersCol = collection(firestore, "users")
// get a single user
const userDoc = doc(firestore, "users/david")

Both of those references will allow you to get all of the data at that location. For collections, we can query to restrict the result set down a bit more. For single documents, you retrieve the whole document so there’s no querying needed.
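Paths alternate between collections and documents, so the number of segments tells you which kind of location a path points at. The sketch below is plain JavaScript rather than SDK code, and describePath is a hypothetical helper. The "users/david/expenses" path is an assumed example of a collection nested under a document.

```javascript
// Classify a Firestore-style path by counting its segments:
// an odd count ends on a collection, an even count ends on a document.
function describePath(path) {
  const segments = path.split("/").filter((segment) => segment.length > 0)
  if (segments.length === 0) throw new Error("path must not be empty")
  return {
    kind: segments.length % 2 === 1 ? "collection" : "document",
    id: segments[segments.length - 1],
    parent: segments.slice(0, -1).join("/") || null,
  }
}

console.log(describePath("users").kind) // collection
console.log(describePath("users/david").kind) // document
console.log(describePath("users/david/expenses").kind) // collection
console.log(describePath("users/david").parent) // users
```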

Using the Emulator

Firestore has a fully functional offline emulator for development and CI/CD environments. In the last section we set up our project to run against the emulators. But as a review, you set up and run the emulators with the Firebase CLI and then use the JavaScript SDK to connect out to the emulator.

import { initializeApp } from "firebase/app"
import { getFirestore, connectFirestoreEmulator } from "firebase/firestore"
import { config } from "./config"
 
const firebaseApp = initializeApp(config.firebase)
const firestore = getFirestore(firebaseApp)
if (location.hostname === "localhost") {
  connectFirestoreEmulator(firestore, "localhost", 8080)
}

Whenever you are running on localhost your app won’t connect out to production services.
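One thing to watch for: the check above compares against the string "localhost" only, so opening the app at 127.0.0.1 would skip the emulator. A possible way to cover both is sketched below. shouldUseEmulator is a hypothetical helper, not an SDK function, and the list of local hostnames is an assumption you may want to adjust for your own setup.

```javascript
// Hypothetical helper: decide whether the app is being served locally
// and should therefore point the SDK at the emulator.
function shouldUseEmulator(hostname) {
  const localHostnames = ["localhost", "127.0.0.1"]
  return localHostnames.includes(hostname)
}

console.log(shouldUseEmulator("localhost")) // true
console.log(shouldUseEmulator("127.0.0.1")) // true
console.log(shouldUseEmulator("example.com")) // false
```

In the browser you would call it as shouldUseEmulator(location.hostname) in place of the inline comparison.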

onSnapshot()

With a reference made, we have a decision to make. Do we want to get the data one time, or do we want the realtime synchronized data state every time there’s an update at that location? The realtime option sounds fun, so let’s do that first.

import { onSnapshot } from "firebase/firestore"
 
onSnapshot(userDoc, (snapshot) => {
  console.log(snapshot)
})
 
onSnapshot(usersCol, (snapshot) => {
  console.log(snapshot)
})

The onSnapshot() function takes in either a collection or document reference. It passes the state of the data to the callback function and will fire again for any updates that occur at that location. You’ll also notice that it doesn’t hand you the data directly. It hands you an object called a snapshot. A snapshot is an object that contains the data and a lot of important information about its state as well. We’ll get into the other info in a bit, but to get the actual data, you call the snapshot’s data() function.

onSnapshot(userDoc, (snapshot) => {
  // this is one doc
  console.log(snapshot)
  // this is the data
  console.log(snapshot.data())
})
 
onSnapshot(usersCol, (snapshot) => {
  // this is an array of docs
  console.log(snapshot.docs)
  // you can iterate through and map what you need
  console.log(snapshot.docs.map((d) => d.data()))
})

🔥 What makes this realtime? Well, whenever an update occurs at this location it will re-fire this callback. For example:

import { onSnapshot, updateDoc } from "firebase/firestore"
 
onSnapshot(userDoc, (snapshot) => {
  console.log(snapshot.data().name)
  // First time: "David"
  // Second time: "David!"
})
 
updateDoc(userDoc, { name: "David!" })

😎 This isn’t just local, this callback fires across all connected devices.
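If the listener behavior feels abstract, this in-memory stand-in may help. It is not the Firebase SDK: createFakeDoc and its methods are hypothetical, everything runs synchronously, and there is no network. It only mimics the shape of the behavior: the callback fires once with the current state, fires again on each change, and stops when you call the function that was returned (the real onSnapshot() also returns an unsubscribe function).

```javascript
// Minimal in-memory stand-in for one document location with listeners.
function createFakeDoc(initialData) {
  let data = initialData
  const listeners = new Set()
  return {
    // Like onSnapshot(): fire once with the current state, then on every
    // change. Returns a function that stops listening.
    onSnapshot(callback) {
      listeners.add(callback)
      callback({ data: () => data })
      return () => listeners.delete(callback)
    },
    // Like updateDoc(): apply a partial change and notify every listener.
    update(partial) {
      data = { ...data, ...partial }
      listeners.forEach((callback) => callback({ data: () => data }))
    },
  }
}

const fakeUserDoc = createFakeDoc({ name: "David" })
const seenNames = []
const unsubscribe = fakeUserDoc.onSnapshot((snapshot) => {
  seenNames.push(snapshot.data().name)
})

fakeUserDoc.update({ name: "David!" }) // listener fires again
unsubscribe()
fakeUserDoc.update({ name: "ignored" }) // no listener left to fire

console.log(seenNames.join(", ")) // David, David!
```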

Writing data

Speaking of updates, what we see right here is one of several update functions or, as we call them, mutation functions.

setDoc()

In Firestore calling setDoc() is considered a “destructive” operation. It will overwrite any data at that given path with the new data provided.

import { doc, setDoc } from "firebase/firestore"
 
const davidDoc = doc(firestore, "users/david_123")
setDoc(davidDoc, { name: "David" })

For the instances where you want to granularly update a document, you can use another function.

updateDoc()

The updateDoc() function can take in a partial set of data and apply that update to an existing document.

import { doc, updateDoc } from "firebase/firestore"
 
const davidDoc = doc(firestore, "users/david_123")
updateDoc(davidDoc, { highscore: 82 })

It’s important to note that updateDoc() will fail if the document does not already exist. In the case where you can’t be certain if a document exists but you don’t want to perform a destructive set, you can merge the update.

const newDoc = doc(firestore, "users/new_user_maybe")
setDoc(newDoc, { name: "Darla" }, { merge: true })

This will create a new document if needed and if the document exists, it will only update with the data provided. It’s the best of both worlds.
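The three write behaviors can be compared side by side with a small model. This is plain JavaScript over a Map, not the Firebase SDK: fakeSetDoc and fakeUpdateDoc are hypothetical stand-ins, they run synchronously, and the merge here is shallow, which is a simplification for documents with nested fields.

```javascript
// In-memory "database": path -> document data.
const store = new Map()

// Models setDoc(): overwrite by default, combine fields when merge is set.
function fakeSetDoc(path, data, options = {}) {
  const existing = store.get(path)
  const next = options.merge && existing ? { ...existing, ...data } : { ...data }
  store.set(path, next)
}

// Models updateDoc(): partial update that fails if nothing exists yet.
function fakeUpdateDoc(path, data) {
  if (!store.has(path)) throw new Error(`No document to update at ${path}`)
  store.set(path, { ...store.get(path), ...data })
}

fakeSetDoc("users/david_123", { name: "David", highscore: 50 })
fakeUpdateDoc("users/david_123", { highscore: 82 }) // name is kept
fakeSetDoc("users/david_123", { name: "Dave" }) // overwrite: highscore is gone
fakeSetDoc("users/david_123", { highscore: 90 }, { merge: true }) // name is kept
fakeSetDoc("users/new_user_maybe", { name: "Darla" }, { merge: true }) // created

console.log(JSON.stringify(store.get("users/david_123"))) // {"name":"Dave","highscore":90}
console.log(JSON.stringify(store.get("users/new_user_maybe"))) // {"name":"Darla"}
```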

deleteDoc()

Deleting a document is fairly straightforward.

import { doc, deleteDoc } from "firebase/firestore"
 
const davidDoc = doc(firestore, "users/david_123")
deleteDoc(davidDoc)

Generating IDs

Now that’s for single documents. What about adding new items to a collection? Do you have to think of a new ID every time?

import { collection, doc, addDoc } from "firebase/firestore"
 
// Don't want to have to name everything? Good!
const someDoc = doc(firestore, "users/some-name")
const usersCol = collection(firestore, "users")
// Generated IDs!
addDoc(usersCol, { name: "Darla" })

The addDoc() function will create a document reference behind the scenes, assign it a generated ID, and then send the data to the server. What if you need access to this generated ID before you send its data off to the server?

// The ids are generated locally as well
const newDoc = doc(usersCol)
console.log(newDoc.id) // generated id, no data sent to the server
setDoc(newDoc, { name: "Fiona" }) // Now it's sent to the server

Generated IDs in Firestore are created on the client. When you create a child document reference from a collection reference without giving it a name, Firestore automatically assigns it a generated ID. From there you can use that ID for whatever you need before sending data up to the server.
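To see why no server round trip is needed, here is a rough sketch of generating an ID locally. The 20-character alphanumeric shape matches what generated IDs look like, but generateId is a hypothetical function: the SDK's own generator is an internal detail, and this version uses Math.random() purely for brevity.

```javascript
const ID_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

// Build a random ID on the client; nothing is sent anywhere.
function generateId(length = 20) {
  let id = ""
  for (let i = 0; i < length; i++) {
    id += ID_ALPHABET.charAt(Math.floor(Math.random() * ID_ALPHABET.length))
  }
  return id
}

const exampleId = generateId()
console.log(exampleId.length) // 20
```

Because the ID space is so large, two clients generating IDs independently are extremely unlikely to collide, which is what makes client-side generation workable.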

Timestamps

One of the tricky aspects of client devices is dates and timestamps. Firestore allows you to set dates on documents.

const newDoc = doc(firestore, "marathon_results/david_123")
// Imagine this runs automatically when a runner crosses the finish line
setDoc(newDoc, {
  name: "David",
  finishedAt: new Date(), // Something like: '5/1/2022 9:32:12 EDT'
})

In this example, this document is added with a finishedAt field set to the device date. But it’s only the current date based on that machine. Imagine if this was an app on the user’s phone that ran this code when they crossed the finish line. The user could change their device settings and set the world record if they wanted to. That’s why we rarely use local dates for timestamps. Instead, we use a special value, serverTimestamp(), to tell Firestore to apply the time on the server.

import { doc, setDoc, serverTimestamp } from "firebase/firestore"
 
const newDoc = doc(firestore, "marathon_results/david_123")
// Imagine this runs automatically when a runner crosses the finish line
setDoc(newDoc, {
  name: "David",
  finishedAt: serverTimestamp(),
})

This function acts as a placeholder on the client that tells Firestore to apply the time when reaching the server.
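The placeholder idea can be modeled in a few lines. This sketch is not how Firestore is implemented: SERVER_TIMESTAMP and applyServerWrite are hypothetical names, and the "server" is just a function that receives its own clock. It shows the key property, which is that a tampered device clock never reaches the stored value.

```javascript
// A unique marker meaning "fill this field in on the server".
const SERVER_TIMESTAMP = Symbol("serverTimestamp")

// Stand-in for the server: swap every marker for the server's own clock.
function applyServerWrite(data, serverNow) {
  const resolved = {}
  for (const [field, value] of Object.entries(data)) {
    resolved[field] = value === SERVER_TIMESTAMP ? serverNow : value
  }
  return resolved
}

const tamperedDeviceClock = new Date("1999-01-01T00:00:00Z")
const serverClock = new Date("2022-05-01T13:32:12Z")

// A device date is stored exactly as the client sent it.
const untrusted = applyServerWrite({ finishedAt: tamperedDeviceClock }, serverClock)
// The placeholder is replaced with the server's time.
const trusted = applyServerWrite({ name: "David", finishedAt: SERVER_TIMESTAMP }, serverClock)

console.log(untrusted.finishedAt.toISOString()) // 1999-01-01T00:00:00.000Z
console.log(trusted.finishedAt.toISOString()) // 2022-05-01T13:32:12.000Z
```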

Incrementing values

One of the most common tasks with data is just simply incrementing and decrementing values. The math is simple, but the edge cases in a realtime system can be really tricky.

  • 🔥 You first need to know the state of the data
  • 🔥 Then you need to add or subtract from the value
  • 🔥 Then you write the new value

But what happens if that value was updated during that process? The new value will likely be wrong. Firestore has a few ways of handling these types of updates, but the most convenient is the increment() function. There is no separate decrement() function in the JavaScript SDK; to decrement, you pass increment() a negative number.

import { doc, updateDoc, increment } from "firebase/firestore"
 
const davidScoreDoc = doc(firestore, "scores/david_123")
const darlaScoreDoc = doc(firestore, "scores/darla_999")
 
updateDoc(davidScoreDoc, {
  score: increment(-10), // subtract 10
})
updateDoc(darlaScoreDoc, {
  score: increment(100),
})

The increment() function ensures that the operation is applied to the latest value in the database. It’s important to note that it only works reliably if run at most once per second on a given document. If you need updates faster than that you can use a solution called a distributed counter. But that’s for another class.
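The race described in the steps above can be reproduced with a tiny model. This is plain JavaScript with a hypothetical fakeDb object and applyIncrement function, not SDK code. It contrasts two clients doing read-modify-write with two clients sending only a delta for the "server" to apply.

```javascript
// Shared "database" value that two clients modify.
const fakeDb = { score: 100 }

// Read-modify-write: both clients read before either one writes.
const readByClientA = fakeDb.score // 100
const readByClientB = fakeDb.score // 100
fakeDb.score = readByClientA + 10 // A writes 110
fakeDb.score = readByClientB + 5 // B writes 105, so A's +10 is lost
const afterReadModifyWrite = fakeDb.score

// Server-applied increment: each client sends only the delta, and the
// "server" applies it to whatever the current value is.
fakeDb.score = 100
function applyIncrement(delta) {
  fakeDb.score += delta
}
applyIncrement(10)
applyIncrement(5)
applyIncrement(-3) // a negative delta decrements
const afterIncrements = fakeDb.score

console.log(afterReadModifyWrite) // 105
console.log(afterIncrements) // 112
```

The first result should have been 115, but one update was silently overwritten. With deltas, the order of arrival no longer matters for the final total.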

Now there’s one thing you should notice here. We’re making updates to the server, but nowhere are we awaiting the result of the update. It’s still an async operation, but why aren’t we awaiting the result?