Wednesday, July 02, 2014

SAP ABAP Perfection - Fixing performance issues - The SELECT statement

In my last post, I had discussed about the simple ways on increasing the performance of your program. Continuing where I left off, this post discusses the subtle improvements you can make to your SELECT statement. It will bring you substantial performance improvements without much effort.
I have tried to answer some basic questions. Please feel free to skip those of you are already aware about the same.

Do you need to read this post?
If you have ever faced performance issues with the programs you work on our are supporting your clients on. If not, read this post for some general knowledge ;)

What is the SELECT statement?
The SELECT statement is one of the very important statements of ABAP language which enables us to fetch data from the underlying database.

What is the basic syntax?
As mentioned above the use of the SELECT statement is to get specific data set from some table(s) and store it in one or more variables. 

SELECT [n fields] FROM [table] INTO [Internal Table] 
WHERE [condition 1] AND [condition 2].

Simplifying -

SELECT [what] 
FROM [where]
INTO [store in]
WHERE [based on which conditions].

How to improve performance? 
You will be happy to know that we can improve performance in each section of the select statement easily. Just remember the rules discussed below and you should do just fine.

SELECT [what] - In this section, we define which fields we need to select from the database. 
  • Use only those fields which are required instead of selecting all the fields in the table. This not only improves performance, but also uses much less memory space for execution
  • Put the fields to be selected in the order in which they are maintained in the table

FROM [where] - This is where we mention the table or views from where we want to get the data. 
  • Use FOR ALL ENTRIES instead of JOINS. 'FOR ALL ENTRIES' restricts the data set for querying the second table (when we want data from two tables together). This avoids the huge memory space required by the JOIN to execute. 
  • Use database VIEWS instead of JOINS. Database views have a performance advantage over dynamic JOINS and can help you to select the data faster

INTO [store in] - In this section, we define where we want to put the data selected from the table. 
  • Get all the data required by your program into an internal table using a single SELECT statement instead of doing multiple selects. You can separate the data programmatically into different internal tables once you have the full data set. This ensures that you avoid multiple database hits and thus increase performance dramatically. Also, separation of data in the program requires much less time than several database hits
  • Avoid selecting the table entries into a work area inside a loop. This is by far the biggest performance problem that brings the system to a halt. In the old days, developers had to use the SELECT-ENDSELECT statement to fetch data. Now, you have the option of selecting into an internal table. Use it without fail

WHERE [based on which conditions] - This section is used to define the conditions based on which the select query will fetch the data.
  • Conditions should be based on the index in the table. E.g.: if the primary index in the table is maintained as [field1, field2 and field3], then the conditions in the WHERE clause should also be like - [field1 = var1 AND field2 = var2 AND field3 = var3]. Basically, following the sequence of the index tells the SELECT statement to use that index while querying the table. If you do not follow the sequence of index, then it depends on the system query analyzer to find the best index possible. In many cases, it does not find any index to use resulting in a huge performance impact
  • Avoid using the OR condition. In the open sql language, the OR condition is converted into multiple selects resulting in more database hits than you planned for. Try to skip the OR condition and later delete the extra data selected programmatically. This will give you better performance as lesser number of database hits will be executed

So there you are guys, these were the simple steps you can follow to get a lot of juice from your SELECT statement. Of course, there are not the only steps. There are much more and need a case to case analysis. 

Feel free to put in your comments and suggestions.

Sunday, June 22, 2014

SAP ABAP Perfection - Fixing performance issues - The Read Table statement

Hello fellow ABAPers, this is the first post in the SAP ABAP Perfection series, aiming to help you fix those performance issues you face regularly.

Have you faced performance issues with your programs lately? 
Are you working with huge sets of data (usually not expected in test systems but proving to be a nuisance in productive ones) ? 
Do you have a lot of programs in your SAP system which regularly go into long run?

I, like the most of us, in our project or support life-cycle, have faced these situations. We spend loads of efforts and man hours trying to find out why and what caused this sudden performance problem. Sometimes the problem becomes so critical that entire production lines are required to stop due to job failures or long running jobs. This has serious financial consequences for your company or client. 

The Good News - 
Now that I have shared the worst of the situations with you, time for some good news. All these performance problems can be fixed  and avoided easily by using the following statements correctly - 

  • The SELECT statement
and,
  • The READ TABLE statement

Let's discuss the READ TABLE statement
In this discussion, I will give you a few hints on how to use the READ TABLE statement correctly. These hints can be applied to 90% of the times you use the statement. For the remaining 10%, please refer to the documentation of READ TABLE in the SAP documentation (F1 Help).
  
What does the Read Table statement do?
Put simply, the Read Table statement is used to read the contents of a single row of an internal table. Optionally, it can use user given conditions to pinpoint the exact row required by the user.

Which other ABAP statement has a similar functionality?
The LOOP AT statement (and it's modifications) in ABAP can also be used to read the contents of an internal table.

What is the difference between READ TABLE and LOOP AT statement? 
The LOOP AT statement will always do a linear search (read each entry one by one) to find the correct entry matching the user given conditions. While the READ TABLE statement can make use of the binary search technique (divide and search) to find the correct entry. The binary search option is substantially faster as the number of entries in the internal table grows. As a comparison of the speed: to search a list of one million items takes as many as one million iterations with linear search, but never more than twenty iterations with binary search.

What are the best ways to use READ TABLE?
Now that we already know that the Read Table statement is superior than the Loop At statement as far as searching is concerned, let us dwell on the best ways to use the Read Table statement.

  • Always use the INDEX option or the BINARY SEARCH option wherever possible
  • When using BINARY SEARCH, make sure that the internal table is sorted with those keys
e.g.: 

Read using INDEX option

READ TABLE [itab] INTO [variable] INDEX 1.

The above statement is going to read the contents of row 1 (as INDEX 1 is being read) of internal table [itab] and store it in the [variable]. This means that the user is requesting the Read Table to fetch the contents of row 1 directly without searching the internal table. This statement can be used in situations where we are sure that only the first (or any specific entry) is required to be read.

e.g.: We have a list of Sales Orders in internal table [itab_so] and we want to read the details of the latest Sales Order. In such a case, we can easily sort the [itab_so] in descending order of the creation date and read the first entry in it. This means that once we have sorted the [itab_so] in descending order, we are sure that we will require only the first entry (or INDEX 1) of the internal table.

Read using the BINARY SEARCH

SORT [itab] BY [key_1].
READ TABLE [itab] INTO [variable] WITH KEY [key_1] = [variable_1] BINARY SEARCH.

The above READ TABLE statement is going to search for the entry in [itab] where the field [key_1] matches the [variable_1] input given by the user. The search algorithm to be used here is binary search as specified by the final section of the statement. As already mentioned in a previous section, sorting is mandatory when using binary search.

Conclusion

If you keep these easy to use options in mind and implement them in your programs, then you will notice a definite increase in performance. The larger the data set you are working with, the greater the performance increase.

With these suggestions, I leave you to ponder over them and maybe try and implement them in one of your demo programs to see the difference.