Format and adjust data

Check and change data type

Check your data for inconsistent units of measurement to prevent problems during data analysis.

Convert temperatures from Fahrenheit to Celsius

Use the CONVERT function to change units of measurement.

The formula takes the form =CONVERT(CELL, "F", "C"), where "F" (Fahrenheit) is the unit to convert from and "C" (Celsius) is the unit to convert to.

=CONVERT(B2, "F", "C")

To convert wind speed from miles per hour (mph) to meters per second (m/s), use =CONVERT(CELL, "mph", "m/s").
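
The arithmetic behind these two unit conversions can be sketched in Python. This is a minimal illustration, not how any spreadsheet implements CONVERT; the function names fahrenheit_to_celsius and mph_to_mps are hypothetical.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def mph_to_mps(speed_mph: float) -> float:
    """Convert a speed from miles per hour to meters per second."""
    # 1 mile = 1609.344 m and 1 hour = 3600 s, so 1 mph = 0.44704 m/s.
    return speed_mph * 1609.344 / 3600


print(fahrenheit_to_celsius(212))  # boiling point of water: 100.0
print(round(mph_to_mps(10), 4))    # 4.4704
```

Checking a few known values this way is a quick test that a column of converted data uses the units you expect.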

Lock data in a table

Using functions to transform data can create issues that data professionals need to address. For instance, when a referenced value changes, the calculated value that depends on it changes as well. Converting the function's result to a static value and locking the data in the table keeps the cell consistent even when the surrounding data changes.

Data validation

  • Add dropdown lists with predetermined options
  • Create custom checkboxes
  • Protect structured data and formulas

  1. Data validation in spreadsheets allows you to control what can and can't be entered in your worksheet. This can help cut down on how much data cleaning you have to do later on.
  2. Data validation can be used to add drop-down lists to cells with predetermined options for users to choose from, making it easier for collaborators to interact with your table.
  3. You can use data validation to create custom checkboxes.
  4. Data validation can be used to protect structured data and formulas, especially in a spreadsheet where multiple people are working together.
  5. Data validation can help your team track progress, protect your tables from breaking, and help you customize tables to your needs.
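
The idea behind a drop-down list, accepting only values from a predetermined set, can be sketched in Python. This is an illustration of the concept only, not how a spreadsheet applies validation; ALLOWED_STATUSES and validate_status are hypothetical names, and the status options are made up.

```python
# Hypothetical set of options, like the entries of a drop-down list.
ALLOWED_STATUSES = {"Not started", "In progress", "Complete"}


def validate_status(value: str) -> str:
    """Return the value unchanged if it is an allowed option, else raise."""
    if value not in ALLOWED_STATUSES:
        options = ", ".join(sorted(ALLOWED_STATUSES))
        raise ValueError(f"{value!r} is not allowed; choose one of: {options}")
    return value


print(validate_status("In progress"))
```

Rejecting unexpected entries at the point of input is what reduces the cleaning needed later.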

Conditional formatting

  • Conditional formatting is a spreadsheet tool that changes how cells appear when values meet specific conditions. This adds visual cues to spreadsheets, making it easier to understand the table at a glance.
  • Combining conditional formatting with data validation can create custom tools for spreadsheets.
  • Different rules can be set to apply to cells, such as "Format Cells if... Text is exactly", which can be used to color-code drop-down menus with specific text.
  • Conditional formatting can be used to track the status of different tasks, making it easy to see how many tasks are in progress or completed.
  • The "Date is after" rule can be used in combination with data validation to track upcoming deadlines.
  • Spreadsheet programs like Excel have built-in color codes that can be used.
  • Data validation and conditional formatting can create custom tools and visual cues that make your information easy to understand.

Transform data with SQL

Data analysts often need to convert data formats to complete their analyses. This requirement applies even when using SQL instead of a spreadsheet. SQL, like spreadsheets, follows standard rules for data type conversion. Why is data transformation crucial for a data analyst? Think of a driver who can change a flat tire: being able to transform data into the correct format yourself speeds up your analysis, because you don't need to wait for someone else to do the conversion.

Common conversions

The following table summarizes some of the more common conversions made with the CAST function. Refer to Conversion Rules in Standard SQL for a full list of functions and associated rules.

Starting with | CAST function can convert to
Numeric (number) | Integer, Numeric (number), Big number, Floating-point number, String
String | Boolean, Integer, Numeric (number), Big number, Floating-point number, String, Bytes, Date, Date time, Time, Timestamp
Date | String, Date, Date time, Timestamp

The CAST function (syntax and examples)

CAST is an American National Standards Institute (ANSI) standard SQL function supported by many SQL dialects, including BigQuery. This section provides the BigQuery syntax and examples of converting the data types in the first column of the previous table. The syntax for the CAST function is as follows:

CAST(expression AS typename)

Where expression is the data to be converted and typename is the data type to be returned.

Converting a number to a string

The following CAST statement returns a string from a numeric value stored in the column MyCount of the table called MyTable.

SELECT CAST(MyCount AS STRING) FROM MyTable

In the above SQL statement, the following occurs:

  • SELECT indicates that you will be selecting data from a table
  • CAST indicates that you will be converting the data you select to a different data type
  • AS comes before and identifies the data type which you are casting to
  • STRING indicates that you are converting the data to a string
  • FROM indicates which table you are selecting the data from
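
To try a number-to-string cast without a BigQuery project, the statement above can be approximated with Python's built-in sqlite3 module. This is a sketch under stated assumptions: SQLite is a different SQL dialect that spells the string type TEXT rather than STRING, and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (MyCount INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?)", [(3,), (42,)])

# SQLite uses TEXT where BigQuery uses STRING.
rows = conn.execute(
    "SELECT CAST(MyCount AS TEXT) FROM MyTable ORDER BY MyCount"
).fetchall()
converted = [row[0] for row in rows]
conn.close()

print(converted)  # the values come back as Python str objects
```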

Converting a string to a number

The following CAST statement returns an integer from a string stored in the column MyVarcharCol of the table called MyTable. (An integer is any whole number.)

SELECT CAST(MyVarcharCol AS INT) FROM MyTable

In the above SQL statement, the following occurs:

  • SELECT indicates that you will be selecting data from a table
  • CAST indicates that you will be converting the data you select to a different data type
  • AS comes before and identifies the data type which you are casting to
  • INT indicates that you are converting the data to an integer
  • FROM indicates which table you are selecting the data from

Converting a date to a string

The following CAST statement returns a string from a date stored in the column MyDate of the table called MyTable.

SELECT CAST(MyDate AS STRING) FROM MyTable

In the above SQL statement, the following occurs:

  • SELECT indicates that you will be selecting data from a table
  • CAST indicates that you will be converting the data you select to a different data type
  • AS comes before and identifies the data type which you are casting to
  • STRING indicates that you are converting the data to a string
  • FROM indicates which table you are selecting the data from

Converting a date to a datetime

Datetime values have the format YYYY-MM-DD hh:mm:ss, so the date and time are retained together. The following CAST statement returns a datetime value from a date.

SELECT CAST(MyDate AS DATETIME) FROM MyTable

In the above SQL statement, the following occurs:

  • SELECT indicates that you will be selecting data from a table
  • CAST indicates that you will be converting the data you select to a different data type
  • AS comes before and identifies the data type which you are casting to
  • DATETIME indicates that you are converting the data to a datetime value
  • FROM indicates which table you are selecting the data from
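
The two date conversions can also be illustrated with Python's standard datetime module. This is a sketch of the concept rather than BigQuery behavior; the sample date is hypothetical, and setting the time portion to midnight is a choice made explicitly in this code.

```python
from datetime import date, datetime, time

my_date = date(2024, 3, 15)  # hypothetical sample value

# Date to string: ISO format, YYYY-MM-DD.
as_string = my_date.isoformat()

# Date to datetime: a date carries no time of day, so this sketch
# fills the time portion with midnight (00:00:00).
as_datetime = datetime.combine(my_date, time.min)

print(as_string)    # 2024-03-15
print(as_datetime)  # 2024-03-15 00:00:00
```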

The SAFE_CAST function

If the CAST function cannot convert a value, the query fails and BigQuery returns an error. To avoid errors when a conversion fails, use the SAFE_CAST function instead. SAFE_CAST returns NULL instead of an error when the cast fails.

The syntax for SAFE_CAST is the same as for CAST. Simply substitute the function directly in your queries. The following SAFE_CAST statement returns a string from a date.

SELECT SAFE_CAST(MyDate AS STRING) FROM MyTable
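
The "return NULL instead of failing" behavior can be mimicked in Python, where None plays the role of NULL. This is a minimal sketch of the idea only; safe_cast_int is a hypothetical helper, not part of BigQuery or any library.

```python
from typing import Optional


def safe_cast_int(value: str) -> Optional[int]:
    """Return the value as an int, or None if it cannot be converted."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # A failed conversion yields None rather than raising an error.
        return None


print(safe_cast_int("42"))   # 42
print(safe_cast_int("abc"))  # None
```

The trade-off is the same as with SAFE_CAST: the query or script keeps running, but failed conversions become missing values that you need to check for afterwards.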

Import and combine data in spreadsheets and databases

Import data

As a data analyst, you often need to import data from one file or location to another. Both spreadsheets and SQL offer features that facilitate data import.

Import data in spreadsheets

=IMPORTRANGE(spreadsheet_url, range_string)

Import data in SQL

INSERT INTO [destination_table_name]
SELECT [column names, separated by commas, or * for all columns]
FROM [source_table_name]
WHERE [condition]
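
The INSERT INTO ... SELECT pattern can be tried locally with Python's built-in sqlite3 module. This is a sketch, assuming SQLite in place of your actual database; the table names source_orders and west_orders, the columns, and the rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_orders (order_id INTEGER, region TEXT)")
conn.execute("CREATE TABLE west_orders (order_id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO source_orders VALUES (?, ?)",
    [(1, "West"), (2, "East"), (3, "West")],
)

# Copy only the rows that satisfy the WHERE condition into the destination.
conn.execute(
    """
    INSERT INTO west_orders
    SELECT order_id, region
    FROM source_orders
    WHERE region = 'West'
    """
)

imported = conn.execute(
    "SELECT order_id, region FROM west_orders ORDER BY order_id"
).fetchall()
conn.close()

print(imported)  # [(1, 'West'), (3, 'West')]
```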

Combine data

Combine data in spreadsheets

=CONCATENATE(item 1, item 2)

To include a space between the items: =CONCATENATE(item 1, " ", item 2)

Combine data in SQL

SELECT CONCAT(field1, " ", field2)
FROM [table_name]

Give the combined fields an alias by using AS:

SELECT CONCAT(field1, " ", field2) AS alias
FROM [table_name]
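
Combining fields under an alias can be tried with Python's built-in sqlite3 module. This is a sketch under stated assumptions: SQLite stands in for your database, the || operator is used for concatenation because a CONCAT function is not available in every SQLite version, and the customers table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name (alias)
conn.execute("CREATE TABLE customers (first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO customers VALUES ('Ada', 'Lovelace')")

# || joins strings in SQLite; AS names the combined column.
row = conn.execute(
    "SELECT first_name || ' ' || last_name AS full_name FROM customers"
).fetchone()
full_name = row["full_name"]
conn.close()

print(full_name)  # Ada Lovelace
```

The alias matters here because it is the name used to read the combined column from the result.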

You can import and combine data in both spreadsheets and SQL databases. To import data into a spreadsheet, utilize the IMPORTRANGE function. For importing data into a SQL table, use the INSERT INTO, SELECT, and WHERE commands. To combine two or more data strings in spreadsheets, use the CONCATENATE function. In SQL, the CONCAT function allows you to combine fields.

Strings in spreadsheets

The LEN function

The LEN function returns the length of a string. Use this formula to check the length of the datetime strings in column B: =LEN(B2)

The FIND function

The FIND function locates specific characters and substrings in a string. Use it to determine where in the string something like a space is located: =FIND(" ",B2)

Note: FIND is case sensitive, so always make sure you input the substring correctly.

The RIGHT function

Use the RIGHT function to return a specific number of characters from the right side of a cell's contents: =RIGHT(B2,8)

The LEFT function

Use the LEFT function to return a specific number of characters from the left side of a cell's contents: =LEFT(B2,8)
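
For comparison, the four string functions have close counterparts in Python. This is a sketch with a hypothetical datetime string standing in for the contents of cell B2; note that FIND counts positions from 1 while Python counts from 0.

```python
value = "2024-03-15 14:30:00"  # hypothetical contents of cell B2

# Like =LEN(B2): total number of characters.
length = len(value)

# Like =FIND(" ",B2): str.find is 0-based, so add 1 to match FIND.
# (str.find returns -1 when nothing is found, where FIND returns an error.)
space_position = value.find(" ") + 1

# Like =RIGHT(B2,8): the last 8 characters (the time).
right_part = value[-8:]

# Like =LEFT(B2,10): the first 10 characters (the date).
left_part = value[:10]

print(length, space_position, right_part, left_part)
```

Using the position of the space to decide how many characters to take from each side is a common way to split a datetime string into its date and time parts.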

Advanced spreadsheet tips and tricks

Google Sheets

  • Keyboard shortcuts for Google Sheets: This is a great resource for quickly learning a range of keyboard shortcuts that can make regular tasks quicker and easier, like navigating your spreadsheet or accessing formulas and functions. This list contains shortcuts for the desktop and mobile versions of Google Sheets so that you can apply them to your work no matter what device you are using.
  • List of Google Sheets Functions: This is a comprehensive list of the Google Sheets functions and syntax. Each function is listed with a link to learn more.
  • 23 Must-Know Google Sheet Formulas: This blog article from Layer summarizes and describes 23 of the most useful Google Sheets formulas.
  • 18 Google Sheets Formula Tips and Techniques: These are tips for using Google Sheets shortcuts when working with formulas.

Excel

  • Keyboard shortcuts in Excel: Earlier in this list, you were provided with a resource for keyboard shortcuts in Google Sheets. Similarly, this resource provides a list of keyboard shortcuts in Excel that will make performing regular spreadsheet tasks more efficient. This includes keyboard shortcuts for both desktop and mobile versions of Excel, so you can apply them no matter what platform you are working on.
  • 222 Excel shortcuts: This compilation of shortcuts includes links to more detailed explanations about how to use them. This is a great way to quickly reference keyboard shortcuts. The list has been organized by functionality, so you can go directly to the sections that are most useful to you.
  • List of spreadsheet functions: This is a comprehensive list of Excel spreadsheet functions with links to more detailed explanations. This is a useful resource to save so that you can reference it often; that way, you’ll have access to functions and examples that you can apply to your work.
  • List of spreadsheet formulas: Similar to the previous resource, this is a comprehensive list of Excel spreadsheet formulas with links to more detailed explanations. It can be saved and referenced any time you need to check a formula for your analysis.
  • Essential Excel Skills for Analyzing Data: This blog post includes more advanced functionalities of some spreadsheet tools that you have previously learned about, like pivot tables and conditional formatting. These skills have been identified as particularly useful for data analysis. Each section includes a how-to video that will take you through the process of using these functions step-by-step, so that you can apply them to your own analysis.
  • Advanced Spreadsheet Skills: Mark Jhon C. Oxillo’s presentation starts with a basic overview of spreadsheets but also includes advanced functions and exercises to help you apply formulas to actual data in Excel. This is a great way to review some basic concepts and practice the skills you have been learning so far.

There are lots of resources online about advanced spreadsheet tips and tricks. You'll probably discover new resources and tools on your own, but this list is a great starting point as you become more familiar with spreadsheets.

All about the analyze stage

  • The analyze stage in data analytics is crucial for understanding data and predicting future trends.
  • This stage involves cleaning and preparing data and understanding summary measures such as its averages and medians.
  • Tools like SQL and spreadsheets are essential for sorting, filtering, and formatting data.
  • The ability to explain findings confidently to an audience is a key skill.
  • The goal is to tell a story with the data, highlighting trends and predicting future outcomes.

Running into challenges? Not to worry!

  1. Great data analysts often rely on outside resources and consider it empowering and a source of new knowledge for the future.
  2. Using thinking skills like analytical, mathematical, and structured thinking can help approach a problem logically and break it down into smaller parts.
  3. Using the right data analytics terms while searching for solutions online can lead to more search results and better understanding of what other analysts are saying.
  4. Familiarity with basic tools is necessary to understand and apply new functions found online.
  5. Understanding the syntax of formulas and functions for different tools allows you to modify example code to fit your own needs.
  6. Online resources can be used to build new solutions to problems faced during data analysis.