Learn 10 useful tips and tricks that help boost SQL Server application
performance—and streamline development and deployment as well.
By Roman Rehak. This article is a repost; thanks to the original author.
Developers love database programming tips, tricks, and workarounds—especially those that slash development time or boost application performance. Here's a collection of such nuggets for developing SQL Server applications.
1) Replace COUNT(*) With EXISTS When Checking for Existence
Developers often use the
value from the COUNT(*) function when enforcing business rules in Transact-SQL
code. However, try the EXISTS clause instead if you're using the COUNT(*) value
only to evaluate whether at least one row meets certain
conditions. For example, consider this code from the Northwind database:
IF (SELECT COUNT(*) FROM Orders
WHERE ShipVia = 3) > 0
PRINT 'You cannot delete this shipper'
The execution plan shows that SQL Server has to read all
255 rows in the Orders table before evaluating the IF expression. You can
achieve the same result more efficiently with EXISTS because the IF condition
evaluates to true as soon as SQL Server finds the first occurrence of 3 in the
ShipVia column:
IF EXISTS (SELECT * FROM Orders
WHERE ShipVia = 3)
PRINT 'You cannot delete this shipper'
The
difference in total execution time isn't much in a sample database such as
Northwind, but use this efficient query against an Orders table with millions
of rows and you'll see a major speed improvement.
2) Be Careful When Using WHERE IN and WHERE NOT IN
SQL Server doesn't
always choose an optimal execution plan when you have a substantial list of
values in the WHERE IN clause. Using WHERE IN and WHERE NOT IN clauses in T-SQL
code can produce an execution plan involving one or more nested loops, which
dramatically increases the number of comparisons SQL Server must perform. Use
the WHERE IN clause only if you have a short list of values you need to
evaluate:
USE Northwind
--This query takes 9 ms to execute
SELECT *
FROM Customers
WHERE CustomerID NOT IN
(SELECT CustomerID FROM Orders)
Replace the WHERE IN clause with OUTER JOIN if you're
using a subquery to generate a potentially large list. Doing so can improve
performance significantly:
USE Northwind
--This query takes 3 ms to execute
SELECT c.*
FROM Customers c
LEFT OUTER JOIN Orders o
ON o.CustomerID = c.CustomerID
WHERE o.CustomerID IS NULL
In this case, the second query uses LEFT OUTER JOIN,
producing an execution plan that lets it run about three times faster than the
first query.
The LEFT OUTER JOIN selects all rows from the Customers
table—whether or not a customer placed any orders—and joins them with the
Orders table. Then the WHERE clause filters out the rows where the columns from
the Orders table have NULL values. Either way, you get a list of customers who
placed no orders, but the second way gives SQL Server a lot less work to do. I
rewrote a query recently using this technique, and the execution time went from
50 seconds to about 500 ms.
3) Randomize Resultset Order With NewID()
You occasionally
might need to randomize the order of the resultset retrieved from SQL Server.
This is often the case in database searches where certain products or services
would otherwise gain an unfair advantage over others based simply on their name. I've
seen a few clever (and not so clever) solutions for randomizing resultsets, but
the solution is actually simple. You can use the NewID() function in
Transact-SQL to generate a GUID for each row, then order the results by the
generated GUID:
SELECT * FROM Products
ORDER BY NEWID()
SQL Server returns products in a different order every
time you run the query. You also can use this technique to return a random row
from a table:
SELECT TOP 1 * FROM Products
ORDER BY NEWID()
However, be careful when using this technique with large
tables. You're only asking for one random row, but the execution plan shows
that SQL Server gives you that random row only after reading each row in the
table, generating a GUID for each row, then sorting all the rows. Consequently,
SQL Server needs several seconds to give you a random row from a table with a
few million rows. So don't use the "SELECT TOP 1…" technique on huge
tables. Instead, restrict the random selection to a subset of the large table.
Select a random range, then use "SELECT TOP 1…" within that range.
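As a sketch of that approach against Northwind's Products table (this assumes the ProductID values are fairly dense, which the text doesn't specify):

```sql
-- Pick a random point in the key range, then let SQL Server seek to
-- the first row at or past it instead of sorting the whole table.
DECLARE @rand int
SELECT @rand = CAST(RAND() * MAX(ProductID) AS int) FROM Products

SELECT TOP 1 * FROM Products
WHERE ProductID >= @rand
```

Note that gaps in the key range skew the selection toward rows that follow large gaps, so this trades perfect randomness for speed.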
4) Increase Default Packet Size for Large Data Fields
SQL Server client
applications communicate with instances of SQL Server through Net-Libraries.
Client Net-Libraries and Server Net-Libraries communicate over the network by
exchanging network packets. The size of network packets depends on the data
access API you're using; the default packet size is 4,096 bytes.
This default works well in
most scenarios, but sometimes you can improve data exchange speed greatly by
increasing packet size—especially if you're sending or receiving large amounts
of XML data, or if you're storing images in SQL Server. The client and server
libraries must exchange and process fewer packets when you increase packet
size. Maximum packet size is 32,767 bytes. You can increase the packet size in
the connection string:
"…;Packet Size=32767;…"
The speed gained in data upload and download depends on
the size of your data fields and on your network topology. One of my applications
stores XML data in SQL Server, with each XML field about 500K in size. My
benchmarks show that the application exchanged XML data with SQL Server about
twice as fast after increasing the packet size to 32,767 bytes.
5) Name Your SQL Server Applications
When you examine SQL Server activity
in Enterprise Manager or with SQL Server Profiler, you can easily identify
which process belongs to which application as long as the
Application column contains a value instead of being blank. SQL Server receives
the application name from the client data access API. You've probably
noticed that .NET applications often show ".Net SqlClient Data
Provider" as their name. This happens when you don't specify the
application name when you create an ADO.NET connection. ".Net SqlClient
Data Provider" becomes the useless value ADO.NET winds up sending to SQL
Server.
You can easily populate the Application Name property with
a unique name for your application. Set the application name in
the connection string:
"…;Application Name=MyApplication;…"
Populating the Application Name property simplifies identifying your application activity on the server; you can use this name as the value of the Application Name filter in SQL Server Profiler traces. This can help you with distributed applications where your app might be running on multiple servers and possibly using multiple databases. Use Application Name as the filter, then run a single trace and capture the workload specific to your app coming from multiple client computers.
6) Utilize SQL Server Aliases
SQL Server Client
Network Utility allows you to create alias names for SQL Server instances. The
utility also lets you associate each alias with a specific network protocol.
Once you create an alias, any connections that use the aliased name will be
directed to the corresponding server name or IP address.
Aliases are beneficial in many scenarios. For example, you
can alias an IP address of a remote SQL Server to a friendly name, or you can
create an alias if your application requires the Named Pipes protocol instead
of TCP/IP. The biggest benefit of using SQL Server aliases comes from being
able to redirect all connections from a client computer to another SQL Server
quickly and easily. One common scenario is using redirection to a standby SQL
Server for high availability. Consider what happens if your primary server goes
down. You need to point all applications to the secondary server in a flash.
This might require modifying many connection strings if you have a Web server
with many Web sites—or you can simply alias the name of the primary server to
the name of your secondary server.
Figure 1. Create SQL Server Aliases.
You can set up a SQL Server alias in the SQL Server Client Network Utility. Any
connections to MyServer end up connecting to MyBackupServer instead, without
you modifying any connection strings. You set aliases on the computers where
your client applications run. SQL Server Client Network Utility is part of the
SQL Server client tools, but you don't have to install the client tools on
every computer where you want to use aliases. The MDAC pack install also
includes this utility, which you can run by executing "cliconfg" from the
command prompt if SQL Server client tools aren't installed.
7) Disable Triggers Instead of Dropping Them
Business rules in
table triggers often expect your application to update the table one row at a
time. Also, some triggers generate an error when the code in the trigger
assigns to a local variable the value returned by selecting a column from the inserted virtual table. The assignment
fails if you're updating multiple rows because the inserted table contains more than one row, so the subquery
returns more than a single value.
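The fragile pattern looks something like this (the table, column, and trigger names here are hypothetical):

```sql
-- This trigger works for single-row updates, but a multirow UPDATE makes
-- the subquery against 'inserted' return more than one value and fail
-- with "Subquery returned more than 1 value".
CREATE TRIGGER MyTrigger ON MyTable FOR UPDATE
AS
DECLARE @Status int
SELECT @Status = (SELECT Status FROM inserted)
IF @Status = 1
    PRINT 'Status changed to approved'
```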
Multirow updates need
special handling in such a scenario. Developers often wind up dropping the
trigger before a multirow update, then re-creating it afterward to resume
single-row processing. However, I recommend disabling the trigger instead:
ALTER TABLE MyTable
DISABLE TRIGGER MyTrigger
You can re-enable the trigger once you finish your
multirow data updates:
ALTER TABLE MyTable
ENABLE TRIGGER MyTrigger
8) Don't Rename Script-Based Objects
You'll often create a
script-based object, such as a view, a stored procedure, or a user-defined
function. SQL Server then stores the script that created the object in the text column of the syscomments system table. SQL Server
also creates a row in the sysobjects
table, as it does for most database objects.
The problem arises when
you rename the object in Enterprise Manager or with the sp_rename stored procedure.
Unfortunately, SQL Server then only modifies the object name in the sysobjects
table. It doesn't modify the script in the syscomments table, so the script
remains stored with the old name.
The new object will work,
but you'll get an error the next time you try to modify the object in
Enterprise Manager, because the object name in the script doesn't match the new
name. You might also end up deploying the object to another server with the old
name inadvertently, because Enterprise Manager generates scripts by reading the
syscomments table. The "CREATE …" part of the script winds up using
the old name. Instead of renaming the object, save the original script, delete
the object, then re-create it with the new name.
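For example, renaming a view this way looks roughly like the following (the view names are hypothetical):

```sql
-- Renaming with sp_rename would leave 'CREATE VIEW CustomerList ...'
-- behind in syscomments. Dropping and re-creating from the saved
-- script keeps the stored text in sync with the object name.
DROP VIEW CustomerList
GO
CREATE VIEW ActiveCustomerList
AS
SELECT CustomerID, CompanyName
FROM Customers
GO
```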
9) Don't Trust Generated Table Modification Scripts
Enterprise Manager,
Visual Studio, and third-party tools let you add, drop, and modify table
columns in a visual, easy-to-use environment. You click on Save; then they
generate and execute Transact-SQL scripts under the hood. It's tempting to
modify tables this way—after all, they shield you from the complexities of data
definition language.
However, the scripts these
visual tools produce often generate more SQL than required, or they generate
scripts that might not be suitable in a production environment. These scripts
often drop and re-create the table you're modifying instead of using the ALTER
TABLE command. Consequently, modifying a large table (with millions of rows) can
take a long time while the data is copied out of and back into the re-created table.
You don't need to stop
using these tools, though. Fortunately, they let you save modification scripts
before they execute. Use this option, and inspect the generated script before
executing it. I suspect you'll find that a few simple ALTER TABLE commands will
speed up the same modifications and obviate having to drop and re-create the
table.
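For instance, adding a nullable column to a large table needs only a one-line command rather than the drop-and-recreate script a visual designer might generate (the Notes column here is hypothetical):

```sql
-- Adding a NULLable column is a quick metadata change; no table
-- rebuild and no copying of millions of rows.
ALTER TABLE Orders
ADD Notes nvarchar(200) NULL
```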
10) Examine Your Apps With SQL Server Profiler
Allocate a few hours
at the end of the development cycle to run a trace in SQL Server Profiler and
capture your application's workload. This lets you identify any potential
long-running queries and create additional indexes for optimizing data
retrieval.
As a rule, I create a
primary key and a clustered index on each table. Next, I index each foreign
key, then I create additional indexes to speed up certain queries. Doing a SQL
Server trace allows me to quickly identify any opportunities for useful indexes
I might have missed.
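That indexing baseline might look like this for a hypothetical order-detail table:

```sql
-- Clustered primary key first, then a nonclustered index on each
-- foreign key; query-specific indexes come later, guided by the trace.
CREATE TABLE OrderItems (
    OrderItemID int NOT NULL
        CONSTRAINT PK_OrderItems PRIMARY KEY CLUSTERED,
    OrderID int NOT NULL,
    ProductID int NOT NULL,
    Quantity int NOT NULL
)
CREATE NONCLUSTERED INDEX IX_OrderItems_OrderID
    ON OrderItems (OrderID)
CREATE NONCLUSTERED INDEX IX_OrderItems_ProductID
    ON OrderItems (ProductID)
```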
My trace includes the Stored Procedures-RPC:Completed event
and the TSQL-SQL:BatchCompleted
event. Both of these events populate the Duration column. I usually investigate
trace events that take longer than 100 ms. I paste the code into Query Analyzer
and examine the execution plan. You'll need some practice with reading and
interpreting execution plans, but after a while you should be able to spot any
potential issues in the execution plan quickly. You should always run a trace
against your production database as well, to make sure it performs well with
tables containing production data volumes rather than the small samples in your
development database.