Monthly Archives: December 2011

Remove duplicate records in sql

In MS SQL 2005 and later editions, a much better way has been provided to remove duplicate records smartly, while having full control of which records to remove. It can be accomplished by using the ROW_NUMBER() which will return a number of row within a partition of a result set. With this code:

ROW_NUMBER() OVER (PARTITION BY Clause ORDER BY Clause)

You can have a partition on something like Email or SSN or a combination of several fields like: First, Last, Street Number, Street Name and City.

 

The idea is to perform data cleansing while removing records where you can control any aspect of it. What's even better is the ability to combine different pieces of data from different records into one record while the rest of the records for that partition (group) are removed with no data being lost. Duplicate data is a problem that exists in many databases and the purpose of removing duplicate records is to improve productivity and increase the performance of your database. The below article will discuss the technique of how to remove duplicate records - rows in MSSQL database.

 

Let's create some sample data:

USE TempDB

-- Check if the table exists and if not then create it. 
IF NOT EXISTS(SELECT * 
    FROM Sys.Tables
    WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[#TableWithDups]'))
    
    -- Create temp table
    BEGIN
    CREATE TABLE #TableWithDups(
        [PK]    TINYINT NOT NULL PRIMARY KEY,
        [EMail] [varchar](40) NULL,
        [First] [varchar](15) NULL,
        [Last] [varchar](20) NULL,
        [Gender] [varchar] (1) NULL, 
        [DOB] [varchar](8) NULL,
        [Phone] [varchar](10) NULL,
        [StreetNumber] [varchar](30) NULL,
        [StreetName] [varchar](30) NULL
    ) 
    END
-----------------------------------------------------------------
-- Check if some temp tables exist in the TempDB database -------
-----------------------------------------------------------------
ELSE
BEGIN
    TRUNCATE TABLE #TableWithDups
END

IF EXISTS(SELECT * 
    FROM Sys.Tables
    WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[#DeDup_Testing]'))
BEGIN
    DROP TABLE #DeDup_Testing   
END

--------------------------------------------------------------------
--- Insert duplicates records into TableWithDups
--------------------------------------------------------------------
INSERT INTO #TableWithDups ([PK],[EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (1,'MSSQL1@RemoveDupRecords.Com','Susan','Smith','F', '19600912','','1100','Main Street')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (2,'MSSQL1@RemoveDupRecords.Com','Susan','Smith','',   '','1113058336','1100','Main Street')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (3,'MSSQL1@RemoveDupRecords.Com','Susan','Smith','F','','1113058336','1100','Main Street')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (4,'MSSQL2@RemoveDupRecords.Com','Joe','Smith','','19640825','2223058336','1100','Main Street')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName])  
VALUES (5,'MSSQL2@RemoveDupRecords.Com','Joe','Smith','M','19640825','2223058336','1100','Main Street')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName])  
VALUES (6,'MSSQL3@RemoveDupRecords.Com','Alex','Walker','M','','5553058888','55','KingFisher Road')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (7,'MSSQL3@RemoveDupRecords.Com','Alex','Walker','F','19580912','','55','KingFisher Road')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName])  
VALUES (8,'MSSQL3@RemoveDupRecords.Com','Alex','Walker','','1952','','55','KingFisher Road')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (9,'MSSQL4@RemoveDupRecords.Com','Uri','Pearl','M','1961','3333054422','110','Robertsville Road')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName])  
VALUES (10,'MSSQL4@RemoveDupRecords.Com','Uri','Pearl','','19600912','3333054422','110','Robertsville Road')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (11,'MSSQL4@RemoveDupRecords.Com','Uri','Pearl','','','','110','Robertsville Road')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (12,'MSSQL5@RemoveDupRecords.Com','Terry', 'Thurber','M','','4443058377','500','Larkin Street')

INSERT INTO #TableWithDups ([PK], [EMail],[First],[Last],[Gender],[DOB],[Phone],[StreetNumber],[StreetName]) 
VALUES (13,'MSSQL5@RemoveDupRecords.Com','Terry', 'Thurber','','19600809','','500','Larkin Street')

Below is some sample code to explain this idea of how to remove duplicates using the MS SQL ROW_NUMBER and partition idea:

SELECT 
    [First],
    [Last], 
    [Email],
    [DOB],
    ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Email,DOB DESC,Gender) AS Rn, 
    StreetNumber,
    StreetName
FROM 
    #TableWithDups 

In the above, the Email works as a partition and has a unique value to allow us to create a group of records on that field and also control the order of how the records will show up. In the data below Susan Smith with the record that has the value of Rn = 2 and Rn=3 are considered as duplicate records and should be deleted.

 

In actuality the only difference compared to the original data that was created in the temp table named: TableWithDups is the additional column named: Rn. Every record in the Rn that is greater than 1 is considered as a duplicate one. Adding the ORDER BY on different fields can change the way the records will show up as Rn=1. See Sample 1 below to get the idea:

Sample 1:

Data cleansing remove duplicate records In SQL Database

 

When running the below code as Common Table Expression (CTE) flexibility is increased as shown below and a WHERE clause is added as WHERE Rn = 1. As a side note using the CTE is considered as a temporary result. The CTE can be self-referenced multiple times in the same query. It is similar to a derived table but much more flexible.

;
WITH DeDupe
AS
(
SELECT 
    [PK],
    [First],
    [Last], 
    [Email],
    [Gender],
    [DOB],
    ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Email,DOB DESC) AS Rn, 
    StreetNumber,
    StreetName
FROM 
    #TableWithDups 
)   
SELECT [PK],[First],[Last],[Email], [Gender],[DOB],Rn,StreetNumber,StreetName
FROM DeDupe WHERE Rn = 1
ORDER BY [First],[Last]     

Here is the data as shown below in Sample 2 after running the above code:

Sample 2:
Find and remove duplicate records In MSSQL Data

 

As you see in the above in Sample 2, all duplicate records that have Rn > 1 do not show up anymore. But if you look closely we lost some data that could have been combined in the record with Rn=1. We can increase the partition by adding more fields to make sure there are no duplicate addresses, but that would not change the fact that some data will be lost.

 

The real power of the above technique is that we can combine several values from different duplicate records in the record that gets the value of Rn=1. Let's examine closely what we have above. If you look at Sample 2 then Joe Smith and Terry Thurber lost the data in the Gender column. Looking at Sample 1 under PK= 5, the Gender is M for Joe and also PK = 13 which is Terry Thurber. The same problem exists also for the Phone number as well. The following code will take care of this problem and incorporate all required pieces of data in one record.

;
WITH DeDupe
AS
(
SELECT
    PK,
    [First],
    [Last],
    [Email],
    [Gender],
    [DOB],
    MAX(Phone) OVER (PARTITION BY Email) AS Phone_Updated ,
    MAX(CASE WHEN CAST(DOB AS INT) > 0 THEN CAST(DOB AS INT) ELSE DOB END) 
    OVER (PARTITION BY Email) AS DOB_Updated,
    MAX(Gender) OVER (PARTITION BY Email) AS Gender_Updated,  
    ROW_NUMBER() OVER (PARTITION BY Email,First,Last,LTRIM(LEFT(UPPER([StreetNumber]),4)) + LEFT(UPPER([StreetName]),4)  
    ORDER BY DOB DESC,Phone DESC,Gender DESC,Email,First,Last) AS Rn, 
    [Phone],
    [StreetNumber],
    [StreetName]
FROM #TableWithDups 
) 
SELECT 
    PK,
    [First],
    [Last],
    [Email],
    [Gender],
    [Gender_Updated], 
    [DOB],
    CASE WHEN CAST(DOB_Updated AS VARCHAR(10))='0' THEN '' ELSE CAST(DOB_Updated   
    AS VARCHAR(10)) END DOB_Updated,
    [Phone],
    [Phone_Updated],
    [Rn],
    [StreetNumber],
    [StreetName]
INTO #DeDupe_Testing FROM DeDupe WHERE Rn > 0

-- Let's show some data
SELECT * FROM #DeDupe_Testing WHERE Rn=1

The below data is the result of running the above code:

 

Sample 3:

Remove duplicate records in sql server and merge records to one

The below is the continuation of the above data.

 

Data cleansing- find and remove duplicate records in ms sql server

 

Using the MAX(Phone) with the partition technique on each field that we don't want to lose data, will make sure to get the data in the row that Rn=1. Having this code on the Phone MAX(Phone) OVER (PARTITION BY Email) AS Phone_Updated and create a new column to show the new data near the original column helps to analyze the data to make sure everything works fine.

 

In the above code we make sure that no data is lost while we remove all duplicate records. When all partitions are handled then we create ROW_NUMBER() OVER (PARTITION…) on all combinations of fields that create a unique partition. We also ORDER each field that was part of the MAX() as DESC. Order the field as DESC is crucial and is part of the technique of merging all data from different rows into one row of the same partition. In addition, I added code around the DOB (Date of Birth) since in some cases it has only 0 value and clearing the DOB to nothing is a better data for me.

 

In sample 3 all fields such as: Gender_Updated, DOB_Updated and Phone_Updated are filled with data while all other duplicate records were removed. In Sample 3 Joe Smith and Terry Thurber did not lose any data compared to Sample 2. In addition, the idea is to copy all records to a different table while leaving the original data as UN touched for a backup.

 

The DOB is a VARCHAR (10), since when import data it is always a character based field that comes in many different formats. For additional info on that matter please see this article at: data cleansing and manipulating date format.


Conclusions:
==========

Once you have found your duplicate records, you face the problem of merging multiple records into one without losing any data. Some of the problems you might face when merging records could be that it has a phone or another record could have an email address and you cannot afford to remove these duplicate records when you are taking the risk of losing some important data. The duplicate record remover that was discussed above covers all of these points and provides a solution that safely removes duplicate records from your database. In addition, to use the code above the Sample 3 data is when you know for sure that you may have fields with empty data; otherwise simply use the code above the Sample 2 data.

ERP Software Selection

Approach to Selecting ERP Software for your SMB Company

 

You are an owner of a small or medium size company, and you need to select ERP software to run your business. You are confused between the large availability of ERP software packages. How do you select the right one for your business, and how do you find one you can afford?

 

But first, you may be asking yourself what is ERP software and why do I need it in the first place? ERP stands for Enterprise Resource Planning. This software is designed to manage end-to-end processes of an enterprise. A typical ERP software help manage processes such as sales and order entry, customer invoicing, procurement, manufacturing, inventory control, shipping, quality control, and accounting just to name a few. For more information about ERP, please refer to my blog titled The Benefits of ERP Software for Small or Medium Size Businesses.

 

Like a typical small and medium size business, you may be running your business by deploying many spreadsheets, Access databases, and leveraging popular QuickBooks or Peachtree accounting software. However, your business is growing, and you are exceeding the number of users and records limits of QuickBooks, and managing all these spreadsheets is starting to prove inefficient and increasingly error prone - costing you real money in lost sales, missed shipments, lost inventory of some items and shortages of others. Besides, QuickBooks Enterprise seemed to have enough functionality to get you started, but you heard that real ERP software can automate many of your business processes.

 

You heard of ERP software, and your initial research unveiled that there are many packages starting from the top tier SAP and Oracle systems that cost millions and ending with many PC based packages. So, how do you select the one that fits your business, and how do you find a package that will not consume your entire budget?

 

It turns out that selecting a complex software package that runs almost all aspects of your business is both art and science. Standard ERP selection methodologies were developed by management consulting firms. They are based on documentation of the existing business process and careful analysis and documentation of business requirements. This traditional approach includes the following main activities:

 

  • Document the As-Is business processes to understand the current situation
  • Document the To-Be business processes to be deployed in the new environment
  • Document business requirements for the software including any regulatory requirements
  • Document technical requirements - such as performance criteria, requirement for computing environment (for example, must run on Windows OS)
  • Identify software candidates and document gap analysis between software capability and above
  • Develop a detailed RFP based on above information and send to selected software ISVs
  • Analyze the responses, and invite the first cut of ISVs to perform demos
  • Develop final ranking by criteria that is based on best functional fit, technology, and cost considerations
  • Negotiate with final two candidates and select one to proceed to contract

 

Two main challenge of this approach to selecting software for a SMB is that a) it takes considerable effort, time, and expense, and b) discourages taking advantage of best-practices already built into an ERP package - by specifying in too much detail what and how you want the package to work, you may miss some best business practices that may be built into a candidate package, or end-up with a package that is overly complex for your needs.

 

Based on our extensive experience selecting and implementing enterprise software, we developed a more efficient approach specifically for small and medium size businesses (SMBs) -- one that assures due-diligence in software selection while optimizing the time and effort of the selection process.

 

First of all, as an SMB, you have to recognize that modern Open Source ERP software offer viable solutions to many companies. Today's Open Source ERP offers competitive functionality to traditional ERP software at a very low-cost that you can afford (in principle, open source is free, but most customers would not be able to implement the free version).

 

Our ERP software selection process is based on the following 10 steps:

 

    
1.    Identify the top level processes that are needed to be supported by the future ERP solution. For example, order 
       entry, procurement, manufacturing, inventory control, shipping.   
2.    Identify next level process characteristics. For example make-to-order, process/batch manufacturing, cross-dock
       distribution processes, drop shipment support, etc.
3.    Identify industry specific regulatory requirements such as GxP and Computer Validation FDA requirements for Life
       Sciences industry.
4.    Walk through the existing company operations review and note the following:
           a.  Unique business processes that as competitive advantage and critical for your company operations
           b.  Unique software that was developed in-house or purchased that is critical for your company operations
5.    Identify any technical requirement specific to your company (such as must use Apple OS, or prefer Windows, 
       or prefer minimum IT support).
6.    Based on the above information, develop a high level functional fit document.
7.    Match the fit document with a list of best-in-class Open Source ERP vendors and identify one package that can 
       address your needs at a lowest cost.
8.    Document and present expected customization requirements (if any) to support your business.
9.    Bring the ERP software vendor (ISV) to perform demo.
10.  Get final buy-in and negotiate agreements with ISV and implementation partner.

 

For a typical SMB customer, the selection process may take less than a week. The traditional management consulting based process takes months to complete.

Our ERP software selection methodology assumes that an experienced ERP consultant help with the evaluation and selection process. The success and efficiency of our approach is largely dependent on the ability of our experienced consultants to help you in identifying your core business needs and matching them with ERP software capability.

 

We are able to distinguish between those specific processes that are truly your competitive advantage and need to be preserved, versus those that could be changed and adapt best-practices already embedded in ERP software. Recognizing this important distinction between your custom business processes is more art than science and largely dependent on experience of our consultants and principals and is the key to a successful selection of your future ERP solution in a minimum amount of time.

 

Therefore, if you need ERP software to grow or maintain the growth of your business, you should know that the selection and implementation of such software does not have to cost of a fortune in money and time. There are affordable solutions in the market place, and partnering with the right services provider, will help you kickoff a successful CRM and ERP implementation effort.



Data compression in large tables in MS SQL server

Do you know that you can compress data while using MS SQL 2008 or any later version? Compression is available for now in SQL Server 2008 Enterprise and Developer edition.

 

This type of compression reduces the size of the table by at least of 25% and more. It also depends on how big your table is.

 

When you deal with TB(s) of data and you have a RAID setup in your dedicated server for your data with 5 SCSI HD you may want to make sure that you are not running out of disk space. For best performance I still use SCSI HD with 15000 RPM.

 

Compression can allow more rows to be stored on a page, but does not change the maximum row size of a table or index.

 

MS SQL data compression provides another great benefit, since compressed data is stored in fewer pages and therefore, queries need to read fewer pages from the disk and thus, improves the performance of I/O intensive workloads. However, some additional CPU resources are required on the database server to compress and decompress the data, while data is exchanged within the application. That additional resource based on my experience is not that significant at all. I am using 2 CPUs in my server and it works just fine.

 

In my case since I work with large amount of data and use PAGE compression. Row Compression helps to store data more efficiently while using 4 bits per compressed column to store the length of the data in the column. NULL and 0 take no additional space

 

Page compression works better for me since it is performing Row Compression plus optimizes the storage of multiple rows in a page, and this way minimizes the data redundancy.

 

To compress indexes, you must explicitly set the compression property of the indexes. By default, the compression setting for indexes will set to NONE when the index is created.

 

On a PAGE compression, a compressed index which could be clustered or non- clustered, the leaf level pages are compressed as PAGE, but no leaf pages are ROW compressed.

 

The leaf level is the data of the table. It is composed of data pages. All columns of the tables are stored for every row of the table.

 

No Leaf level - are composed of index pages.

 

Let's run the below stored procedure to see what will be the savings when compressing a 44,058.422 table which is 44GB.

 

--Using the PAGE compression 
USE Users 
GO 
EXEC sp_Estimate_Data_Compression_Savings NULL, 'Users', NULL, NULL, 'PAGE' ; 
GO  

 

See this link: for the this stored procedure named: sp_estimate_data_compression_savings (Transact-SQL)

 

The results shows 1 record only and the column named: Sample_size_with_requested_compression_settings(KB) has the value of 27816 which was a savings of 16242.422 KB. That is %36.87 and it was calculated in here: 1 - (27816/44058.422) * 100 = %36.87

 

Now let's create a compressed clustered index on the Users table and see the actual size of the table. Pay attention that I added the the DATA_COMPRESSION = PAGE at the end of the index script.

 

CREATE UNIQUE CLUSTERED INDEX Users_PK ON dbo.Users 
(
    Users_PK
)   
WITH (
    PAD_INDEX  = OFF, 
    STATISTICS_NORECOMPUTE  = OFF, 
    SORT_IN_TEMPDB = OFF, 
    IGNORE_DUP_KEY = OFF, 
    DROP_EXISTING = OFF,
    ONLINE = OFF, 
    ALLOW_ROW_LOCKS  = ON, 
    ALLOW_PAGE_LOCKS  = ON,
    DATA_COMPRESSION = PAGE)

 

In this case I have no indexes on the Users table and therefore creating the clustered index would be relatively faster since no other indexes should be rebuilt on that table. Creating the compressed clustered index will compress the table as well, at the same time.

 

To create the index on 44GB table and 240 million records took about 23.27 min and the index size was just 54.750MB. The actual amount of kb for the Users table was reduced to 29361.078. The savings was: 1 - (29361.078/44058.422 * 100) = %33.358. For me it is a major savings also in disk space.

 

We can also use the below command in MS SQL 2008 or above to enable PAGE compression while trying to re build the HEAP (When the table does not have a clustered index) and requires all non-clustered indexes on the table to be rebuilt so that they have pointers to the new row locations in the heap.

 

ALTER TABLE Users 
REBUILD WITH (Data_Compression = PAGE)

 

For more info about the MS SQL data compression features see this subject SQL Server Books Online.

 

Conclusions:
-----------------

Using the compression capability of MS SQL is great and saves HD space and increases performance and reduces cost. It increases the query performance by reducing I/O.In addition, no application changes are required to implement table or index compression.

 

Usually if you need to create a clustered index I would remove first all non-clustered indexes since they will not be compressed when they will be re-built while a compressed clustered index is created.

 

Make sure to set up Compression as DATA_COMPRESSION = PAGE on non-clustered indexes after you created a compressed clustered index. This way you take full advantage of the compression capability of data and indexes which was added in MS SQL 2008. In addition, see this article of how to remove duplicate records in a smart way.

Benefits of ERP Software for Small or Medium Size Businesses

If you are a typical stakeholder of a small or medium size business, you probably prefer to invest in assets that contribute directly to the growth and revenue of your business (for example, tools for making your product, machines to expand capacity, people to design and sell your product, etc.). You know that information technology and business software is important, but you avoid spending capital in this area, as long as you can manage by using spreadsheets and other office software, and at most utilize the popular QuickBooks to manage your accounting, invoicing, and do very basic inventory management.

 

You may have heard of Enterprise Resource Planning (ERP) software that large companies use to run their businesses, but you think that you cannot afford this software anyway, and you are not sure what would be the real benefits of such an investment. This article is going to dispel some of these myths - ERP software has a real value for your business and it does not have to be expansive at all!

 

First, I would like to quickly define what ERP software does and how it works. ERP software has been around for at least 20 years and it was developed from a previous class of software called MRP (Material Requirement Planning) that existed at least 20 years before the emergence of ERP. ERP software was developed to manage most of the business related processes of a typical enterprise. ERP can be customized to fit many industries including general manufacturing, retail, CPG, Life Sciences, Process, distribution, and service industries.

 

The main benefit of ERP software is that it integrates the flow of information between various business processes and integrates business transactions with company's financial system in real time. ERP enables improved business decisions by making information from various parts of the business available to throughout your business operations in real-time. It posts relevant business transactions to company's general ledger and other financial systems allowing management to see financial implications in real time. ERP typically covers all processes starting from sales order entry, shipping/delivery, product pricing, warehousing and inventory, procurement, production, invoicing (both A/P and A/R), service, financials, and many others.

 

For example, when entering a sales order, customer service person (or your web based client) can see the available inventory for this sales order in the various warehouse locations. If no inventory is available at the time of the order, ERP system can allocate planned production or planned purchases, or estimate when a make-to-order could be completed by your manufacturing operations. ERP can also apply the proper pricing for the applicable customer, sales tax, and shipping charges, and determine promised delivery date. Once the order is entered, the system reserves the inventory to fulfill this order. The system then automatically plans the required production orders to replenish the sold inventory and related external purchases of raw material (or finished goods depending on whether you make the product in-house or buy it from supplier). Please note that this is just one example of a typical ERP process - ERP transactions can be adapted to a large variety of industry and process requirements.

 

ERP contains critical master data for customers, vendors, products (some system refer to items or materials), bill of materials (BOMs), manufacturing work processes and routings, assets, and many other master records. Maintaining these master records would help your company to improve consistency and work from the same set of data.

 

Many small companies may be able to get by using office tools such as spreadsheets and simple accounting software (e.g., QuickBooks or Peachtree), while most of the work is done by company owners and the number of employees is small (although, some SMBs have complex business processes that require ERP almost from the start). However, as your revenue and employee grow, the lack of proper integrated business software will start impact your business results in real ways that impact your bottom line such as lost orders, poor planning, inventory shortages, missed shipments, missed invoicing, and general lack of control.

 

ERP software have many real business benefits and including the following:

 

  • Improved overall management visibility to business operations
  • Improved visibility to company's financial performance in real-time
  • Improved sales performance
  • Improved inventory control and planning
  • Consistent business policies and processes throughout your company improving customer service and profitability
  • Improved on-time delivery
  • Reduced procurement cost
  • Reduced work-in-process inventory
  • Reduced labor costs (for both office operations and manufacturing)
  • Reduced overall waste

….and many more. ERP benefits should be identified and documented for your specific business.

 

The availability of user-friendly QuickBooks software provided a good stop-gap solution for many small businesses, but with all the advancement of QuickBooks functionality, it is still far from being integrated ERP software with proper controls, auditing, and scalability required to support end-to-end business operations. Features that make QuickBooks user friendly (such as ability to reverse transactions), make it a terrible software for controlling a growing business with growing number of employees. SMBs should consider upgrading QuickBooks to ERP as soon as they grow beyond five to ten employees and over $5 million in annual revenue (this is just a general rule of thumb). Many companies should consider implementing ERP light software from the start to avoid migration and re-training efforts.

SMBs have as complex business processes and need as large corporations. Large corporations invested millions in implementing state-of-the-art ERP software packages, but can SMBs afford ERP? It is clear that there are many business benefits to justify investment in this critical software, but as stated in the beginning of this article, SMBs are sensitive to cash outlays. The good news is that the emergence of Open Source ERP software substantially reduced the cost of ERP.

 

Open Source software existed for many years. Many have heard of the Open Source Linux operating system. Open Source software is developed by a community of developers that have no commercial interest in the software they develop. Therefore, in its pure form, Open Source software is free! In the case of Open Source ERP, however, it is not entirely free; because Open Source ERP ISVs (independent software vendors) typically have a basic free version and then offer a more capable version that contains close source code or charge required support and maintenance fees. However, the cost of Open Source ERP software license is still about 10 times lower than the cost of traditional close source ERP software. In recent years, Open Source ERP considerably advanced from a functional perspective becoming a viable alternative to traditional ERP software.

 

Leveraging Open Source ERP software, small and medium size businesses (SMBs) can now afford to implement enterprise class software that was available up until recently only to large corporations. They can scale and grow beyond QuickBooks and Peachtree at a very low investment cost, or implement from the start scalable ERP light solution. Since ERP is such an important enabler of business success, SMBs cannot afford to overlook this opportunity.

 

Written By Simon Raban

Secure MS SQL server when running web application

When people need to connect publically, those individuals are taking the risk of outside attacks on your server. A year and half ago I noticed that 2 hackers are trying to get into my MS SQL 2008 server via the SA account.

 

The Windows Log account basically provided me with every attempt to log into the server plus the IP address of the hacker. One was in Canada and the other was from China. I called the Internet provider of the individual from Canada, and they shut down his Internet, the person from China was another story, I sent them an email with no reply.

 

But with some simple steps we can avoid the attacks on the server. The first step is to change the default port which is 1433 to any new one that are you comfortable with, but try in a different range then the current default port.

 

Secondly, disable the SA account and you can use the below code for that. Make sure that you have a valid admin account before you do that.

ALTER LOGIN sa DISABLE

Then create a trigger script to validate an authorized user based on a table that stores all known credentials of the users. The trigger should be found under Server Objects->Triggers.

 

Before you create the trigger, let's create the table with the following fields in a different database then your data. It could be Utils.mdf

   
CREATE TABLE [dbo].[Logins](
    [Logins_PK] [Int] IDENTITY(1,1) NOT NULL,
    [Name] [VarChar](40) NULL) 
ON [PRIMARY]

Now we can populate this table with all authorized accounts and the below code will add the Administrator's account into the table. The Principal_ID is the ID number of the Principals table. It is unique within that table.

INSERT INTO Logins(Name) 
SELECT Name FROM Sys.Server_Principals
WHERE Principal_ID IN (273)

A Windows Login is an example of an indivisible principal. If you ran the below code then you could see all accounts available, and then you need to make a decision which one should be added into the Logins table.

SELECT Name FROM Sys.Server_Principals

The below were the first accounts that were added by me:

MyDataBaseName\Administrator 
NT AUTHORITY\SYSTEM
NT SERVICE\MSSQLSERVER
NT SERVICE\SQLSERVERAGENT

Then Insert other users' accounts that should be able to log in. I would add one account at the time using the above INSERT sample code.


Here is the Trigger code:
----------------------------------

CREATE TRIGGER [AuditLogin] 
--server means instance level
ON ALL SERVER  
WITH EXECUTE AS SELF
FOR LOGON
AS BEGIN
    IF ORIGINAL_LOGIN() NOT IN (SELECT Name FROM Utils..Logins) 
    ROLLBACK;
END
GO

SET ANSI_NULLS OFF
GO

SET QUOTED_IDENTIFIER OFF
GO

ENABLE TRIGGER [AuditLogin] ON ALL SERVER
GO

Conclusions:
===========
With the simple steps I described above I increased the security of my server tremendously. Changing MS SQL port is a crucial first step. Then when ever you log into the server the AuditLogin trigger is called which checks accounts found in the Logins table. If the account is found in the table then the Log In will be continued by the server. If not then the trigger will roll back that account. Using that technique you can basically disable a user's account by Deleting that record from the Login table without disabling the actual account in the server.

Reasons For Hiring an SEO Company in USA

Google is the most democratic nation in the landscape of the World Wide Web where every website or business is given an equal opportunity to represent itself and compete for the top slot, without any prejudice. This analogy may sound a bit bizarre but this is the reality of the world governed by the Big G. It is like a ray of hope for small and medium sized companies which want to compete with the corporate giants on the same footings.

 

If you are a start-up company or even an established brand with billions of dollars in resources you still need to show Google that you are all about quality if you wish to get the top positions in search engine rankings. But a very obvious question which strikes our minds is why a strong Google ranking is so important. It is important because it is the easiest way of driving traffic to your site without any cost. In other words it is the best mode of advertisement as more than 85% of the online world uses search engines to find information, and if your site ranks in the top three then you are going to reap the benefits and profit for your business. The power of Google placement is so strong that market pundits say that if you are not visible in first few pages of search results you do not exist.

 

In the market of today, due to stiff competition and constantly changing parameters of search engines it becomes a daunting task to rank well without the assistance of an SEO Expert in the USA. The Farber Consulting Group is a leading SEO (Search Engine Optimization) Company in the USA which has led several online SEO campaigns for its clients for years.

The team of SEO Experts at the leading SEO Company in the USA, The Farber Consulting Inc takes care of all the aspects of on-page and off page optimization of client sites at the guaranteed lowest prices.

 

If you are looking for an SEO Expert Agency who can take care of all your online marketing needs, contact us and your business will be known to many more people and generate much more money.

String Manipulation on large data using LIKE operator or PATINDEX

The idea is to perform data cleansing on the Date and Time field from different formats into a unify format such as: YYYYMMDD. Here is a sample data to be used for the data cleansing that I will perform below.

 

  
CREATE TABLE #T 
    (
        D1 [VARCHAR] (20) NULL 
    )
INSERT INTO #T VALUES('5/15/2006 18:11')
INSERT INTO #T VALUES('5/13/2006 17:24')
INSERT INTO #T VALUES('2007-02-07 11:38:07')
INSERT INTO #T VALUES('9/4/2008 6:55 19a')
INSERT INTO #T VALUES('5/11/2006 0:47:52')
INSERT INTO #T VALUES('2007-01-04 12:43:00')
INSERT INTO #T VALUES('')
INSERT INTO #T VALUES('  6/11/2006 0:47:52')
INSERT INTO #T VALUES('9/22/2002 07:20:07')
INSERT INTO #T VALUES('09/22/2002 07:20:07')
INSERT INTO #T VALUES('4/13/2007')
INSERT INTO #T VALUES('2/1/2007 20:23') 
INSERT INTO #T VALUES('04-21-2011 18:47:42')
INSERT INTO #T VALUES('04-21-2008 17:11:00')
INSERT INTO #T VALUES('2/1/2007 11:44') 
INSERT INTO #T VALUES('5/9/2007 9:22')
INSERT INTO #T VALUES('04/13/2007')
INSERT INTO #T VALUES('')
INSERT INTO #T VALUES('2004-04-21 16:00:00')
INSERT INTO #T VALUES('04-21-2008 17:11:00')
INSERT INTO #T VALUES('2004')
INSERT INTO #T VALUES('2010-04-21 16:00:00')
INSERT INTO #T VALUES('0000-04-21 16:00:00')
INSERT INTO #T VALUES('2011')
INSERT INTO #T VALUES('2004')
INSERT INTO #T VALUES('1700')
INSERT INTO #T VALUES('1800')
INSERT INTO #T VALUES('Nov 2 2010')
INSERT INTO #T VALUES('02-22-03 00:00:00')
INSERT INTO #T VALUES('02-22-03 00:00:00')
INSERT INTO #T VALUES('10/5/1941 8/2/2005')
INSERT INTO #T VALUES('30.09.2002')
INSERT INTO #T VALUES('05:05:06' )

ALTER TABLE #T ADD [D2] VARCHAR(20) 
ALTER TABLE #T ADD [Update_Type] TinyInt 

SELECT * FROM #T

The reason that I am using string manipulation instead of the CONVERT() function is: Since the table could have millions of dates that could not be converted using the CONVERT() function and therefore string manipulation is required. Even though most records could use the CONVERT() function, I will still would rather use the string manipulation strategy since in one transaction I convert all dates whether they can use the CONVERT() or not. See examples for these dates:

SELECT CONVERT(VARCHAR(15),CAST('30.09.2002' AS DATE),112)

The ISDATE() function will return 0 but this date could be converted as 20090930 and I could have 5 millions of these dates.


If you ran the above code you will get the below error:

Msg 241, Level 16, State 1, Line 1 Conversion failed when converting date and/or time from character string.


Here is another example of date that should be using string manipulation:

SELECT CONVERT(VARCHAR(15),CAST('1/27/2007 6:55 17a' AS DATE),112)


The actual date could be 20070127 but ISDATE() returns 0 as seems below.

SELECT ISDATE('1/27/2007 6:55 17a') 


The following are 4 patterns that I will use for this article: Each different pattern will be converted to this format as: YYYYMMDD. So the date that appears as this one 9/19/2009 will be shown as this one: 20090919. The above is a realistic data that I need to deal when import data and then perform data cleaning using MS SQL scripts.


1. 9/19/2009
2. 1/1/2009
3. 09/25/2010
4. 2009-01-09

Extracting this format of date: 9/19/2009

SELECT 
    D1
FROM
    #T  
WHERE  D1 <> '' AND D1 LIKE '%[0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]%' 

Adding the left wild card will bring un desirable results of dates as shown below with different formats. You need to use only the right wild card so the requested pattern will show up as needed. This way you can perform the string manipulation without any surprises. As a rule of thumb I also make sure all data is trimmed before starting working on it. (For the import I use SSIS).

--5/15/2006 18:11
--5/13/2006 17:24 --5/11/2006 0:47:52 -- 6/11/2006 0:47:52 --9/22/2002 07:20:07 --09/22/2002 07:20:07 --4/13/2007 --04/13/2007

1. When running the below code for this format: 9/19/2009 you get the right data.

SELECT 
    D1
FROM
    #T  
WHERE  D1 <> '' AND D1 LIKE '[0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]%' 

2. Extracting this format of this kind of date: 1/1/2009

SELECT 
    D1
FROM
    #T  
WHERE D1 <> '' AND D1 LIKE '[0-9][/][0-9][/][0-9][0-9][0-9][0-9]%' 

Running the above code with the right wild card guarantees the right format as shown below.

--9/4/2008 00:38:45
--2/1/2007 20:23
--2/1/2007 11:44
--5/9/2007 9:22

3. Extracting this format of date: 09/25/2010

SELECT 
    D1
FROM
    #T  
WHERE D1 <> '' AND D1 LIKE '[0-9][0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]%' 

The below data shows up

09/22/2002 07:20:07
04/13/2007 

4. Handle this format: 2009-01-09

SELECT
    D1
FROM 
    #T 
WHERE D1 <> '' AND D1 LIKE '[0-9][0-9][0-9][0-9][-][0-9][0-9][-][0-9][0-9]%' 

Running the above query will bring the below records:

--2007-02-07 11:38:07
--2007-01-04 12:43:00
--2004-04-21 16:00:00

Every pattern I will test as shown below to make sure I am getting the right data. In real table I will have a Clustered index on the PK and will add WHERE clause like this one: User_PK BETWEEN 1 AND 10000 and this way the data comes much quicker.

SELECT
    D1,
    REPLACE(LEFT(D1,10),'-','') AS ResultDate
FROM 
    #T 
WHERE D1 <> '' AND D1 LIKE '[0-9][0-9][0-9][0-9][-][0-9][0-9][-][0-9][0-9]%' 

Running the above query will bring the below records:

--2007-02-07 11:38:07  20070207
--2007-01-04 12:43:00   20070104
--2004-04-21 16:00:00   20040421

For the table that you perform the data cleansing, the idea is to create another column that all updates will be implemented there. This way I can compare and see if the conversion was done correct. The Tiny column that also was added, I like to use it as a flag for the different type of conversions. In reality there could be at least 20 types of conversions that could be needed for the full conversion for the date field. See below code that will convert all different types of dates into YYYYMMDD in one script.

UPDATE #T
    SET D2 = CASE WHEN D1 LIKE '[0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]%' THEN 
                SUBSTRING(D1,6,4) + '0' + REPLACE(LEFT(D1,5),'/','')            -- 9/19/2009
             WHEN D1 LIKE '[0-9][/][0-9][/][0-9][0-9][0-9][0-9]%' THEN 
                SUBSTRING(D1,5,4) + '0' + LEFT(D1,1) + '0' + SUBSTRING(D1,3,1)  -- 9/9/2009
             WHEN D1 LIKE '[0-9][0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]%' THEN 
                SUBSTRING(D1,7,4) + REPLACE(LEFT(D1,5),'/','')                  -- 09/19/2009
             WHEN D1 LIKE '[0-9][0-9][0-9][0-9][-][0-9][0-9][-][0-9][0-9]%' THEN
                REPLACE(LEFT(D1,10),'-','')                                     -- 2009-01-09
         ELSE D2 END, 
    Update_Type = CASE WHEN D1 LIKE '[0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]%' THEN 1 -- 9/19/2009
           WHEN D1 LIKE '[0-9][/][0-9][/][0-9][0-9][0-9][0-9]%' THEN 2                  -- 9/9/2009
           WHEN D1 LIKE '[0-9][0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]%' THEN 3        -- 09/19/2009
           WHEN D1 LIKE '[0-9][0-9][0-9][0-9][-][0-9][0-9][-][0-9][0-9]%' THEN 4        -- 2009-01-09
        ELSE D2 END
FROM 
    #T  
WHERE D1 <> ''    

SELECT * FROM #T 

Running the above SELECT statement shows the updated columns as shown in the below:

--D1                    D2          Update_Type
--5/15/2006 18:11       20060515    1
--5/13/2006 17:24       20060513    1
--2007-02-07 11:38:07   20070207    4
--9/4/2008 00:38:45     20080904    2
--5/11/2006 0:47:52     20060511    1
--2007-01-04 12:43:00   20070104    4
--                      NULL        NULL
--  6/11/2006 0:47:52   NULL        NULL
--9/22/2002 07:20:07    20020922    1
--09/22/2002 07:20:07   20020922    3
--4/13/2007             20070413    1
--2/1/2007 20:23        20070201    2
--04-21-2011 18:47:42   NULL        NULL
--04-21-2008 17:11:00   NULL        NULL
--2/1/2007 11:44        20070201    2
--5/9/2007 9:22         20070509    2
--04/13/2007            20070413    3
--                      NULL        NULL
--2004-04-21 16:00:00   20040421    4
--04-21-2008 17:11:00   NULL        NULL

The entire update technique should be used within a batch and please see this article I wrote about it and how to implement it:

 

/computer-consulting-blog/2011/1/14/processing-hundreds-of-millions-records-got-much-easier.aspx

 

With this technique it keeps the log file very small

 

If you prefer to run each process one at the time then you can use the code as shown below for this format: 9/19/2009

UPDATE #T
    SET D2 = SUBSTRING(D1,6,4) + '0' + REPLACE(LEFT(D1,5),'/',''),
    Update_Type = 1
WHERE D1 <> '' AND D1 LIKE '[0-9][/][0-9][0-9][/][0-9][0-9][0-9][0-9]%' 

I also tried to test the PATINDEX() function instead of the LIKE operator as shown below. In this case I actually compared the PATINEX() function to the LIKE operator. Running in a 300 million records table, the LIKE was always faster in most cases at least in 1 second. The LIKE is getting to be by far very slow when there are only small amount of records such as: 14030 records found for a specific pattern in big tables as mentioned above.

 

Running the same pattern using the PATINDEX() function took only 10 seconds while using the LIKE took 1 min and 9 seconds. Usually I ran the COUNT() to get the idea how many records to be processed for each pattern. In most cases I will use the LIKE operator for this kind of data cleansing. Make sure to create an index on the field that you process data on it. Remember to create the same type of field for UPDATE purpose. This one took only 10 seconds to count 14030 records out of 300 million records.

SELECT  -- 10 seconds
    COUNT(D1)
FROM
    #T  
WHERE PATINDEX('[0-9][0-9][-][0-9][0-9][-][0-9][0-9][0-9][0-9]%',D1)=1 AND D1 <>'' 

This one took 1 min and 9 seconds for the same table.

SELECT 
    COUNT(D1)
FROM
    #T 
WHERE D1 <> '' AND D1 LIKE '[0-9][0-9][-][0-9][0-9][-][0-9][0-9][0-9][0-9]%' 

----------------------------------------------------------

------------------------ Conclusion: ------------------

----------------------------------------------------------

I covered some steps I take for different types of the data conversion and in the process I realized that the PATINDEX() could be much faster versus the LIKE operator for some conditions. There could be different types of data that cannot be converted and must be ignored. That's why a second field is nice to have, since that will be the final field for the production database.

 

There are many dates that have different patterns and could be legit ones for conversion but will fail the CONVERT() function. Only string manipulation provides the ultimate control and minimize any errors of date conversion.

Processing hundreds of millions records got much easier

MS SQL 2008 - Processing hundreds of millions records can be done in less than an hour.

 

Processing hundreds of millions of records requires a different strategy and the implementation should be different compared to smaller tables with only several millions of records.

 

Say you have 800 millions of records in a table and you need to delete 200 million. Obviously you can use this code:

DELETE FROM Users WHERE DOB = ''

The problem is that you are running one big transaction and the log file will grow tremendously. The process can take a long time. It also depends on the speed of your server as well.

 

The solution is to use small batches and process 1 to several millions of records at the time. This way the log file stays small and whenever a new process starts, the new batch will reuse the same log file space and it will not grow. Before I used the batch process technique, in one time I had 250 GB log file when I tried to split the Address field.

 

If I need to move 250 millions of records from one database to another the batch processing technique is a winner. Under my server it would take 30 minutes and 41 seconds, and also can track down the time per each batch. This way I will be able to predict when the entire process is finished. Another advantage when using small batch is if you need to Cancel the process from whatever reason then it takes immediately or several seconds to recover. (Depends on your server speed).

 

I basically use this technique even for a small change that I may do in a table. Say you need to add an Identity field and you have a table with 250 millions of records. That simple code:

 
ALTER TABLE Users ADD Users_PK INT IDENTITY(1,1)

will take at least 7 hours to process.

 

Copy the whole table to an empty one will be much faster as demonstrated below. Another example for saving time is if you need to add a computed field as a Persisted one, it took us more than a day without using the batch technique for a table of 250 millions of records. The time it takes also depends of the complexity of the computed field. A persisted computed field was part of the empty table where data was inserted and that did not change the speed of the below process.

 

Another advantage for using ms sql batch processing code is when you have an error. It is very helpful to debug ms sql table. In my case it could be a truncate error when trying to fix data from one field to another. By looking at the BatchProcess table you can see the last processed batch range and you can go right into that range and inspect the data. You know it is the last batch since the code will stop working after the error occurred. When process hundreds of millions records sometimes bad data could cause a truncate issue.

 

--- Inspect the data based on the last batch.
SELECT
   COUNT(StreetName)
FROM
    Users
WHERE Users_PK BETWEEN 6000007 AND 7000007 AND LEN(StreetName) > 30

 

The technique below requires that you have a clustered index on the PK, and this way 1 million records takes to process from 8 to 30 seconds compare to 10 minutes without a clustered index. But even without the clustered index working with batches reduces the processing time by far. To split an Address to Street Number and Street Name without a clustered index took about 8 hours and before it took days to process. For that process an UPDATE was used. For the below process even though I used the ORDER BY First and Last, the clustered index on Users_PK was sufficient for the entire process and no other indexes were needed. Again in other cases you may need to have additional indexes.

 

BEGIN TRY

--- Copy the PK from the source table into the target table-------
SET IDENTITY_INSERT Users..Users ON
------------------------------------------------------------------

-- Increment every 1000000 records. You can choose any number that works for you.
DECLARE @Increment INT = 1000000,
        @LastRecord_PK INT = 0,
        @iRowCount INT = 0,
        @BatchRowCount INT = 0,
        @iCurrentRowCount INT = 0,
        @FirstRec_PK INT = 0
        
-- Get the min of the PK minus 1. This way we can start the range from the first record.
-- Make sure to choose the right source table for the MIN(PK)
DECLARE @FirstRecord_PK AS INT=(SELECT
    MIN(Users_PK)
FROM
    Users.dbo.Users) - 1

SELECT @FirstRecord_PK

-- Select the database for the table that records each INSERT
USE Utilities
IF NOT EXISTS(SELECT *
            FROM Sys.Tables
            WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[BatchProcess]'))

BEGIN
CREATE TABLE [Utilities].[dbo].[BatchProcess](
    [Batch_PK] [INT] IDENTITY(1,1) NOT NULL,
    [RowCountTemp]  [INT] NULL,
    [BatchRowCount] [INT] NULL,
    [DateTimeEntered] [DATETIME] NULL,
    [FirstRecord_PK] INT NOT NULL,
    [LastRecord_PK] INT NOT NULL
) ON [PRIMARY]       
END
ELSE
BEGIN
-- Remove all records from the previous process.
TRUNCATE TABLE Utilities.dbo.BatchProcess
END

-- Create an Error handler table if not found.
IF NOT EXISTS(SELECT *
            FROM Sys.Tables
            WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[Errors]'))

BEGIN
CREATE TABLE [Utilities].[dbo].[Errors](
    [Errors_PK] [INT] IDENTITY(1,1) NOT NULL,
    [Error_No] [INT] NULL,
    [ErrorMessage] [VARCHAR] (4000) NULL,
    [ErrorLine] [VARCHAR] (50) NULL,
    [Date_Time] [DATETIME] NULL,
 ) ON [PRIMARY]
END

--- Starting the time and that will be the first record in the BatchProcess table
INSERT INTO Utilities.dbo.BatchProcess
    (RowCountTemp,BatchRowCount,DateTimeEntered,FirstRecord_PK,LastRecord_PK)
VALUES
    (0,0,GETDATE(),@FirstRecord_PK,@LastRecord_PK)

USE Users
WHILE EXISTS (SELECT TOP 1 Users_PK FROM Users.dbo.Users WHERE Users_PK > @LastRecord_PK)
BEGIN

    -- Set the values for the counters, the first and last record.
    SELECT @FirstRecord_PK = @FirstRecord_PK + 1,
            @LastRecord_PK = @FirstRecord_PK + @Increment

    ------------------------------------------------------------------------------
    --- In here you place your code to process your records ----------------------
    --- Make sure to add the range for the PKs as shown below --------------------
    ------------------------------------------------------------------------------           
        --- Insert records into the target table
    INSERT INTO Users..Users(
           [Users_PK],
           [First],
           [Last],
           [StreetNumber],
           [StreetName],
           [City],
           [State],
           [Zip],
           [Phone],
           [Date_Uploaded]
           )
    SELECT
           [Users_PK],  
           [First],
           [Last],
           [StreetNumber],
           [StreetName],
           [City],
           [State],
           [Zip],
           [Phone],
           SUBSTRING([Date_Uploaded],1,8) AS Date_Uploaded
    FROM
        Users.dbo.Users Ue
    WHERE Ue.Users_PK BETWEEN @FirstRecord_PK AND @LastRecord_PK
    ORDER BY [Ue].[Last],[Ue].[First]
    ------------------------------------------------------------------------------

    -- Gets the number of rows that were processed per batch        
    SET @iCurrentRowCount = @@ROWCOUNT
    -- Gets the number of rows for all batches that were processed so far                                           
    SET @iRowCount = @iRowcount + @iCurrentRowCount
    -- Gets the number of rows per batch
    SET @BatchRowCount = @iCurrentRowCount
    -- Gets the first PK that was processed per batch
    SET @FirstRec_PK = @FirstRecord_PK
    -- Gets the next first PK for the next batch
    SET @FirstRecord_PK = @LastRecord_PK
    PRINT @iRowCount
    
    --- Keep track of each process
    INSERT INTO Utilities.dbo.BatchProcess
        (RowCountTemp,BatchRowCount,DateTimeEntered,FirstRecord_PK,LastRecord_PK)
    VALUES
        (@iRowCount,@BatchRowCount,GETDATE(),@FirstRec_PK,@LastRecord_PK)

END
END TRY

-- Let's check for any error and if so let's log the error.
BEGIN CATCH

INSERT INTO [Utilities].[dbo].Errors
    (
        Error_No,ErrorMessage,ErrorLine,Date_Time
    )
VALUES
    (
    ERROR_NUMBER(),ERROR_MESSAGE(),ERROR_LINE(),GETDATE()
    )

SELECT
    ERROR_NUMBER()      AS ErrorNumber ,
    ERROR_SEVERITY()    AS ErrorSeverity ,
    ERROR_STATE()       AS ErrorState ,
    ERROR_PROCEDURE()   AS ErrorProcedure ,
    ERROR_LINE()        AS ErrorLine ,
    ERROR_MESSAGE()     AS ErrorMessage;
END CATCH;

 

Whenever the above code is running you can run the below code and see the status of the process:

SELECT * FROM Utilities.dbo.BatchProcess ORDER BY DateTimeEntered DESC


In the below image the time difference between rows 7 and 8 was 8 seconds, and in rows 1 to 2 it was 7 seconds, and so far 6,957,786 records were processed, and that batch was 994804 records. You can see the range of PK that was processed as well.

 

Table1

 

Looking further towards the end of this process, then the difference between rows 242 to 243 is 8 seconds as well. As you see you can have a very good estimate of the time for the entire process.

 

Table 2

Welcome

Welome to our blog. Check our articles when you have a chance.
We are a Custom Software Development company and will be glad to assist you with your office automation needs.

 

We provide office automation where we convert your manual procedures into a long term software solution where you can grow withit without the need to re-write it. We also provide affordable ERP and CRM solutions.

We also provide MS SQL Database Management and optimize your database performance.