Simple Data Validation in SQL (2023)

What is data validation in SQL and why is it important? Data validation is a method for checking the accuracy and quality of data. Information in databases is constantly being updated, deleted, queried, or moved by multiple people or processes, so ensuring that data is valid at all times is essential. In this article, we’re going to explain how to add some simple validation rules in SQL. We’ll also look at how a product like the SQL Spreads Excel Add-In can make this process a little easier.

Table of contents

• What is Data Validation in SQL?
• Constraints in SQL
– NOT NULL Constraint
– UNIQUE Constraint
– CHECK Constraint
• How to Add Validation in SQL Spreads
• Summary

What is Data Validation in SQL?

Data validation is the method for checking the accuracy and quality of data. It is often performed prior to adding, updating, or processing data. Similarly, when we want to merge data from disparate sources we often talk of ‘cleansing’ the data – in other words validating it. When validating data, we can check if the data is:

  • complete (ie no blank or null values)
  • unique (ie no duplicate values)
  • consistent with what we expect (eg a decimal between a certain range)

Some examples of how we’d use these checks include:

  • in a payroll table, we may want to set minimum and maximums for the ‘salary’ column to ensure values entered are within acceptable bounds.
  • in a customer table, when storing address details, we’d need to ensure that the zip code is numeric and the correct length.
  • in a product table, we may want to limit the values in the ‘color’ column to a defined list.

Some of you may be saying to yourself “data validation like this should be performed in the application layer, not the database layer”. Of course, validation in the application layer is crucial. But there are instances where updates could be performed directly in the database. It is also good practice to make sure that the database has some form of data validation even it also exists elsewhere.

Although it may sound obvious, we should also point out that although we can use rules to validate data entry, we can’t necessarily ensure the correctness of a value. For example, we may have a validation rule to limit entries in a salary column to values between 5000 and 250,000. This still wouldn’t prevent an incorrect value from being mistakenly entered (even if it falls within the allowable range).

There are some other constraints we can apply during validation, which are slightly more advanced. These include checking if a link between two tables will be destroyed (ie a Foreign Key constraint), setting a default value if none is provided, and also setting a primary key constraint.

We’re going to take a look in more detail at the three constraints we’re most likely to use to validate our data – the NOT NULL, UNIQUE, and CHECK constraints.

Constraints in SQL

Constraints in SQL Server are rules that limit the data that goes into our tables. These constraints ensure the validity of the data and help maintain the integrity of the database. Constraints can be defined when tables are created or afterward and can apply to individual or multiple columns.

If we insert data in the column that meets the constraint rule criteria, SQL Server inserts data successfully. However, if data violates the constraint, the insert statement is aborted with an error message.

NOT NULL Constraint

SQL Server allows NULL values, which translate to ‘unknown value’. There are valid use cases for using NULLS, but there are also obvious cases where we can’t accept a NULL value. In these cases, we can define a NOT NULL constraint on a column.

In the example below, we are creating an employees table, and specifying that all columns except ‘MiddleName’ will not accept NULLs.

CREATE TABLE Employees(EmployeeID INT NOT NULL,FirstName Varchar(100) NOT NULL,MiddleName Varchar(50) NULL,LastName Varchar(100) NOT NULL,Gender char(1) NOT NULL,Address Varchar(200) NOT NULL);

If we need to update existing columns to be NOT NULL, then we can either use an ALTER TABLE statement. Alternatively, we can use SQL Server Management Studio (SSMS) – right-click on the table and click ‘Design’ to make the changes.

Simple Data Validation in SQL (1)

UNIQUE Constraint

We typically use the UNIQUE constraint on ID columns. In the example below, we’re creating a simple table and specifying that that ‘EmployeeID’ should be unique (and not NULL).

CREATE TABLE Employees(EmployeeID INT NOT NULL UNIQUE,FirstName Varchar(100) NOT NULL,MiddleName Varchar(50) NULL,LastName Varchar(100) NOT NULL,Gender char(1) NOT NULL,Address Varchar(200) NOT NULL);

CHECK Constraint

A check constraint consists of a logical expression to determine what values are valid. A simple example would be in a payroll database where we want to specify a maximum value that can be entered. The syntax for the CHECK constraint when creating a table is shown below.

CREATE TABLE table_name(column1 datatype [ NULL | NOT NULL ],column2 datatype [ NULL | NOT NULL ],...CONSTRAINT constraint_nameCHECK (column_name condition));

So, for our payroll example, we could use the following to create a table that enforces a check constraint on values entered into the Salary column:

CREATE TABLE dbo.Payroll(EmployeeID int PRIMARY KEY,EmployeeType INT,Salary decimal(9,2),CONSTRAINT CK_Payroll_Salary_1 CHECK (EmployeeType = 1 and Salary > 0 and Salary < 200000.00));

If we need to add a check constraint to a column in an existing table, we can use the ALTER statement:

ALTER TABLE dbo.PayrollADD CONSTRAINT CK_Payroll_Salary_1CHECK (EmployeeType = 1 and Salary > 0 and Salary < 200000.00);

And to remove a check constraint we can use the following:

ALTER TABLE dbo.PayrollDROP CONSTRAINT CK_Payroll_Salary_1;

Finally, it’s often useful to temporarily enable or disable check constraints, and we can do this as follows: To enable a check constraint:

ALTER TABLE dbo.PayrollWITH CHECK CHECK CONSTRAINT CK_Payroll_Salary_1;

To disable a check constraint:

ALTER TABLE dbo.PayrollNOCHECK CONSTRAINT CK_Payroll_Salary_1;

As you can see, Check Constraints are easy to create and flexible in terms of usage.

How to Add Validation in SQL Spreads

The SQL Spreads Excel Add-in makes it easy to update data in SQL Server from Excel.It also makes it easy to add SQL data validation rules from within Excel using a data post-processing script.

The Data Post-processing script is a SQL query that will be executed as the last step in the transaction that updates the database with the changes made in Excel. By default, SQL Spreads will always validate all entered values against the Data Types in SQL Server. The process described below is a more advanced option that is similar to the CHECK constraint we described earlier.

To add a Data Post-processing script in SQL Spreads, open Document Settings and click the Edit Post-Save SQL Query button.

Simple Data Validation in SQL (2)

In the Post-Save SQL Query dialog box, we can now enter our validation script. The validation script contains the logic that we want to check and an error message that we can display to the user.

The example below shows a SQL query that checks if any of the values entered in the Salary column of the Payroll table are invalid (ie less than or equal to 0 or greater than 200000 for Employees of type 1).

IF (SELECT COUNT([Salary]) FROM [demo].[dbo].[Payroll] WHERE EmployeeType = 1 AND (Salary <= 0 OR Salary > 200000.00)) > 0 RAISERROR('The Salary for Employee_Type 1 must be between 0 and 200000',16,1);

Simple Data Validation in SQL (3)

Now, if the user enters an invalid value in the Salary column when trying to save to the database, the following message will be displayed, and the update transaction will be rolled back.

Simple Data Validation in SQL (4)

Summary

Performing data validation in SQL is important to maintain the integrity of your database. In this article, we’ve explained some simple ways for you to do this using constraint scripts in SQL Server Management Studio.

We also showed you how you can add validation checks from within Excel, using the Data Post-processing feature in SQL Spreads.

If you do a lot of updating of data from Excel to SQL Server, or if you have users who love Excel but are not used to database tools, then download the free trial of the SQL Spreads Excel Add-In.

Simple Data Validation in SQL (5)

Article by

Andy McDonald

Andy has worked 20+ years in the Engineering, Financial, and IT sectors with data analysis and presentation using tools such as SQL Server, Excel, Power Query and Power BI.

Writes for SQL Spreads about Excel and SQL Server and how to tie those two together.

Leave a Reply


Comments (1)

Reply

Simple Data Validation in SQL (6)

Bhanupratap » 18. Jul, 2022

Great Content, currently I am exploring SQL Server, your blog help me a lot, thank you.

FAQs

What are the 4 ways to validate a data from database? ›

Types of Data Validation
  1. Data Type Check. A data type check confirms that the data entered has the correct data type. ...
  2. Code Check. A code check ensures that a field is selected from a valid list of values or follows certain formatting rules. ...
  3. Range Check. ...
  4. Format Check. ...
  5. Consistency Check. ...
  6. Uniqueness Check.
Apr 15, 2021

How to validate data from 2 tables in SQL? ›

Below are some of the methods you can use to compare two tables in SQL.
  1. Compare Two Tables using UNION ALL. ...
  2. Compare Two Table using MINUS. ...
  3. Compare Two Table using JOIN. ...
  4. Compare Two Table using NOT EXISTS. ...
  5. Compare Cells From Two Tables – Cell by Cell Validation. ...
  6. Get Matched and Unmatched Count from Two Tables.
Feb 14, 2022

How to validate data type of column in SQL? ›

You can get the MySQL table columns data type with the help of “information_schema. columns”. SELECT DATA_TYPE from INFORMATION_SCHEMA. COLUMNS where table_schema = 'yourDatabaseName' and table_name = 'yourTableName'.

What are the 4 types of validation? ›

  • A) Prospective validation (or premarket validation)
  • B) Retrospective validation.
  • C) Concurrent validation.
  • D) Revalidation.
Jul 17, 2017

Which is the best approach to validate data? ›

The best way to ensure the high data quality of your datasets is to perform up-front data validation. Check the accuracy and completeness of collected data before you add it to your data warehouse. This will increase the time you need to integrate new data sources into your data warehouse.

What are the two methods of data validation? ›

Data Type Validation: This technique checks if the data entered into the system is of the correct data type, such as a string, integer, or date. Range Validation: This technique checks if the data entered into the system falls within a specific range of values, such as a customer's age between 18 and 65 years old.

Which are used to perform in data validation in SQL? ›

To add a Data Post-processing script in SQL Spreads, open Document Settings and click the Edit Post-Save SQL Query button. In the Post-Save SQL Query dialog box, we can now enter our validation script. The validation script contains the logic that we want to check and an error message that we can display to the user.

What is an example of data validation in a database? ›

Validation is the name given to the process whereby the information entered in the database is checked to ensure that it makes sense. For example, you can use validation to check that only numbers between 0 and 100 are entered in a percentage field, or only Male or Female is entered in a sex field.

How do you ensure data is valid? ›

The most crucial step in ensuring data validity is to establish a clear and consistent data collection process. This should include defining data sources, determining data types and formats, and establishing procedures for data entry and verification.

How to check same values in 2 columns in SQL? ›

How to Find Duplicate Values in SQL
  1. Using the GROUP BY clause to group all rows by the target column(s) – i.e. the column(s) you want to check for duplicate values on.
  2. Using the COUNT function in the HAVING clause to check if any of the groups have more than 1 entry; those would be the duplicate values.
Sep 2, 2020

How to compare two data in SQL? ›

Compare Database Data
  1. From the main menu, go to Tools -> SQL Server -> New Data Comparison. ...
  2. Identify the source and target databases. ...
  3. Select the check boxes for the tables and views that you want to compare.
Mar 3, 2023

What is the commonly used SQL command to check data quality? ›

Use the COUNT function

SELECT COUNT([column_name]) FROM [table_name]; This query returns the number of rows in the table_name. You can compare this count to the expected number of rows to determine if there is any missing data.

How do you perform data validation and data cleaning in SQL? ›

How to leverage SQL Transformations for data validation and cleansing
  1. Data source name.
  2. Give a meaningful name for the Data source name, for example leads.
  3. Data source type.
  4. Select the Data source type. ...
  5. CSV stream input.
  6. After you set the data source as CSV content stream, you can now set the CSV stream input. ...
  7. Data schema.

What is an example of validation? ›

Examples of validating statements. I can see that you are very (upset, sad, frightened, scared). I guess that must have been hard for you. I can see you are making an effort.

What is an example of data validity? ›

Data Validity Business Rules

Valid value combinations are rules that specify which combinations of values are allowed or disallowed. For example, there could be a business rule that surgery is always performed in a hospital and if the data shows otherwise, the data is invalid.

How do you apply Data Validation to a column? ›

Apply data validation to cells
  1. Select the cell(s) you want to create a rule for.
  2. Select Data >Data Validation .
  3. On the Settings tab, under Allow , select an option: ...
  4. Under Data , select a condition.
  5. Set the other required values based on what you chose for Allow and Data .

How to check if value exists in column SQL? ›

You can check if a value exists in a SQL table with the following syntax: SELECT COUNT() FROM table_name WHERE column_name = 'value'; This SQL statement will return the count of rows where the value exists in the specified column.

How to check if column exists in table SQL? ›

  1. Overview. In SQL, the COL_LENGTH() function is used to check the existence of the column in the database.
  2. Syntax. COL_LENGTH ( 'tableName' , 'columnName' )
  3. Parameters. This function takes the following two parameters: ...
  4. Return value. This function returns the length of the desired column.
  5. Example.

What are the 3 validation rules? ›

Validation rule and validation text examples

Value must be zero or greater. You must enter a positive number. Value must be either 0 or greater than 100.

What are the 6 levels of validation? ›

  • SIX LEVELS of VALIDATION.
  • Level One: Stay Awake and Pay Attention.
  • Level Two: Accurate Reflection.
  • Level Three: Stating What Hasn't Been Said Out Loud (“the unarticulated”)
  • Level Four: Validating Using Past History or Biology.
  • Level Five: Normalizing.
  • Level Six: Radical Genuineness.

How do you perform Data Validation and data cleaning in SQL? ›

How to leverage SQL Transformations for data validation and cleansing
  1. Data source name.
  2. Give a meaningful name for the Data source name, for example leads.
  3. Data source type.
  4. Select the Data source type. ...
  5. CSV stream input.
  6. After you set the data source as CSV content stream, you can now set the CSV stream input. ...
  7. Data schema.

How to validate Excel data in SQL? ›

How to Validate Data in Excel?
  1. Step 1 - Select The Cell For Validation. Select the cell you want to validate. ...
  2. Step 2 - Specify Validation Criteria. ...
  3. Step 3 - Under Allow, Select The Criteria. ...
  4. Step 4 - Select Condition. ...
  5. Step 5 - Input Message. ...
  6. Step 6 - Custom Error Message. ...
  7. Step 7 - Click Ok.
Apr 24, 2023

How to validate number in SQL? ›

SQL Server ISNUMERIC() Function

The ISNUMERIC() function tests whether an expression is numeric. This function returns 1 if the expression is numeric, otherwise it returns 0.

How to do a data validation check? ›

Find cells that have data validation rules
  1. On the Edit menu, point to Find, and then click Go To.
  2. Click Special.
  3. Select Data Validation.
  4. To find all cells with data validation, select All, and then click OK. To find cells that have the same validation rules as another cell, select Same, and then click OK.

How to clean dirty data in SQL? ›

How do you clean data in SQL?
  1. Remove irrelevant data. What's considered irrelevant data will vary based on the dataset. ...
  2. Remove duplicate data. ...
  3. Fix structural errors. ...
  4. Do type Conversions. ...
  5. Handle missing data. ...
  6. Deal with outliers. ...
  7. Standardize/Normalize data. ...
  8. Validate data.
Mar 19, 2023

How do you create Data Validation? ›

Create a Validation Rule
  1. Select the cells you want to validate.
  2. Click the Data tab.
  3. Click the Data Validation button.
  4. Click the Allow list arrow.
  5. Select the type of data you want to allow. Any value: No validation criteria applied. ...
  6. Specify the data validation rules. ...
  7. Click OK.

How to check if value exists in database SQL? ›

You can check if a value exists in a SQL table with the following syntax: SELECT COUNT() FROM table_name WHERE column_name = 'value'; This SQL statement will return the count of rows where the value exists in the specified column. If the count is greater than zero, then the value exists in the table.

How to validate date in SQL query? ›

SQL Server ISDATE() Function

The ISDATE() function checks an expression and returns 1 if it is a valid date, otherwise 0.

What is SQL validation rule? ›

Field validation rules Use a field validation rule to check the value that you enter in a field when you leave the field. For example, suppose you have a Date field, and you enter >=#01/01/2010# in the Validation Rule property of that field.

How to check if number or string in SQL? ›

To check if the given value is a string or not ,we use the cast() function. If the value is not numeric then it returns 0, otherwise it will return the numeric value. In this way, we can check whether the value is an integer or not.

How to verify condition in SQL? ›

The syntax of the SQL CHECK constraint is: CREATE TABLE table_name ( column_name data_type CHECK(condition) ); Here, table_name is the name of the table to be created.

References

Top Articles
Latest Posts
Article information

Author: Aracelis Kilback

Last Updated: 05/12/2023

Views: 6324

Rating: 4.3 / 5 (44 voted)

Reviews: 91% of readers found this page helpful

Author information

Name: Aracelis Kilback

Birthday: 1994-11-22

Address: Apt. 895 30151 Green Plain, Lake Mariela, RI 98141

Phone: +5992291857476

Job: Legal Officer

Hobby: LARPing, role-playing games, Slacklining, Reading, Inline skating, Brazilian jiu-jitsu, Dance

Introduction: My name is Aracelis Kilback, I am a nice, gentle, agreeable, joyous, attractive, combative, gifted person who loves writing and wants to share my knowledge and understanding with you.