|
Techforum
Distributed queries
Article
summary
Every organisation has data in multiple databases. Very often we need to perform
queries and updates across different databases of the same or different type.
The Distributed queries feature of SQL Server
provides you with all the facilities required to manage heterogeneous data
in a secure manner.
This article explains the concepts, configuration and
security aspects of Distributed queries.
As an example, a demo of how to combine SQL Server
and Excel data in a single SQL Query is also shown. Further, adding/updating
Excel data with SQL Server commands is also highlighted.
Heterogeneous databases are almost always present in
any organisation. Different applications require or demand a particular type
of database. When data is stored in multiple locations and different database
engines, it gets scattered. However, business users always need reports across
the databases. Cross-application data querying can be done using one of the
following approaches:
1. Upload data from various sources to a common database.
2. Perform queries across databases without data migration.
Obviously the second approach is more appealing because
it does not involve the additional overhead of copying and uploading data repeatedly.
In SQL Server, Distributed queries are performed using
Linked servers.
What is a linked server?
A linked server is any OLEDB or ODBC data source. It
could be another SQL Server database running elsewhere. In addition, it could
be practically any ODBC or OLEDB data source.
This has immense potential because there are so many
database drivers available. Here is a list of possibilities:
1. All existing Clipper or FoxPro databases are now
accessible.
2. All databases from all vendors including Oracle, DB2, Sybase, Informix, etc
can now participate in a native SQL Server query.
3. There are OLEDB providers for non-database things like Active Directory,
Exchange, Indexing Service (free text search) and so on. All these are accessible.
For example, if you are using Active Directory to keep valid usernames, you
can simply perform a select * operation to get the list.
4. In a worst-case scenario, it could even be a simple comma delimited ASCII
file!
Consider the following query:
Select a.productname, b.qty
from product a, <external> b
where a.productid = b.productid
Here the table b could actually be from
any valid OLEDB or ODBC data source.
Whats more, you could also change the data in the linked server tables
by using standard SQL Syntax.
Update <external> a
Set qty = 30
Where a.productid = 434
This is also possible.
Now you can imagine the potential uses of this feature.
Please note: Linked servers are not the same as Remote
servers. A Remote server is another SQL Server that is registered with the base
SQL Server. A user connected to the base server can execute stored procedures
on the Remote SQL Server.
Adding link servers
This is a very simple task provided you know details
of the OLEDB/ODBC provider. Like most tasks in SQL Server, this task can also
be performed using SQL commands, Enterprise Manager interface or SQL DMO object
library.
For this demo, let us see the Enterprise Manager based
connection.
1. Open the Enterprise Manager tree for the base SQL
Server.
2. Expand the Security node.
3. Right-click on the Linked Servers node and choose New Linked Server
4. Specify the linked server name in the dialog. Remember this name, because
this is used in all queries related to that server.
5. The dialog that appears is rather cryptic and requires further explanation
to avoid errors. All that the dialog is asking you is the details of the OLEDB
provider for the linked server. However, this dialog is also used for ODBC providers.
For ODBC support, you have to use the OLEDB provider for ODBC. This makes the
process slightly complicated to understand initially.

* Configuring OLEDB providers:
This is fairly simple.
1. Choose the driver.
2. Specify the connection details.
3. Specify the provider details.
This image shows how to configure a linked server for
the NorthWind Access database.
A very good chart of all configuration options for commonly used linked server
data sources is provided in the online help in the topic sp_addlinkedserver.
* Configuring ODBC providers:
1. Before starting linked server configuration, make
sure there is a SYSTEM DSN for the required ODBC provider. User DSNs are not
usable here.
2. Test the base ODBC DSN in some other tool. (Excel is the quickest way to
test a DSN, by the way.)
3. Now open the Linked server configuration dialog and choose the driver OleDB
provider for ODBC drivers.
4. Type any name in Product Name.
5. Specify the System DSN in the Data Source textbox.
6. Type the OLE name for the Provider String. This is the tricky part. If you
do not enter the right name here, the linked server simply would not work.
7. The best way of finding this name is to open the registry and check out the
name. Choose HKEY_Local_Machine\software\odbc\odbc.ini
\<your system dsn name>\. In this section look for a data item called
FIL. The value mentioned there is the Provider String value.
8. Alternatively the entire ODBC connection string can also be inserted here
(DSNless connection).
9. Once the configuration information is entered, choose OK.
* Viewing the data:
1. Once the configuration is appropriate, the linked
server database is analysed and two branches are added to the linked server
tables and views.
2. When you click on Tables, the tables inside the database are listed.
Now the linked server is ready to be used.
Querying data
This is the simplest part. We have already specified
the linked server name. To access a table in the linked server you use the syntax
linkedserver
tablename. The intervening three dots indicate
various addressing conventions, which we are using defaults for. This is equivalent
to pubs...authors we often use.
Now querying data is simple.
Use the regular syntax for all SQL queries while referring
to the linked server in the way shown above.
Security
Depending upon the type of linked server, the security
considerations will differ. For example, some providers may not have any security
mechanism at all. If there is security implemented by the linked server provider,
then you need to perform a one-time configuration. Later, all heterogeneous
queries will be internally broken up by SQL Server. The queries which belong
to the linked server will be called with appropriate security credentials automatically.
The
requirement for security is simple. When the base SQL Server is calling the
linked server, it needs to know which credentials to use. The username and password
used in the base connection needs to be mapped to appropriate username and password
for each linked server.
In case both the servers are using Active Directory
of Windows 2000 and they support automatic Security Account Delegation, this
mapping can be done automatically. This is called Self Mapping.
However in most cases, you will need to perform this
mapping manually. You will then need to specify the user-id and password manually.
This is done in the Security tab of the Linked server
properties dialog for a particular linked server connection.
As you can see, the local login sa is mapped to the
guest login on the linked server. Also note the various options available in
case there is no mapping specified.
Excel example
Now let us take a full-scale example to demonstrate all this.
* Scenario:
This demo illustrates linking of one master table in
SQL server with a child table in Excel. We will also see how to update data
in Excel using SQL Server syntax.
* Create a table in SQL Server:
1. Move to pubs database.
2. Give the following SQL commands to create the base table and populate it
with data.
create table linkdemo (itemid int, itemname char(30))
go
insert into linkdemo values (1, Pencil)
insert into linkdemo values (2, Pen)
insert into linkdemo values (3, Eraser)
insert into linkdemo values (4, Chalk)
go
3. The table is simple enough so we will not discuss
this further. This is the product master.
* Create a table in Excel:
You must be wondering how to create tables in Excel?
A range of cells with a specific name assigned to it is equivalent to a table
in Excel from linked server point of view.
1. Open a new Excel sheet.
2. Type the following data in it.
| 1 |
44 |
12-Oct-03 |
| 2 |
39 |
10-Oct-03 |
| 1 |
33 |
12-Sep-03 |
| 3 |
11 |
11-Sep-03 |
| 4 |
32 |
4-Sep-03 |
| 2 |
45 |
5-Aug-03 |
3. Now highlight these contents and type sales
in the Name Box in Excel. The Name Box is the area above the top left corner
of the Row and Column names. We just gave a name sales to the block
of data containing sales details. This name would now be interpreted as a table
name when we link this sheet to SQL Server.
4. Save and close the Excel file. The file should not be open while configuring
the linked server.

* Create a system DSN for Excel:
1. Start Data Sources from Administrative Tools.
2. Click on System DSN tab.
3. Click Add.
4. Choose Excel driver.
5. In the dialog, select the correct path of the Excel file.
6. Name the DSN LinkedExcel.
* Create the linked server:
1. Open the New linked server dialog.
2. Specify a name linkexcel to the linked server.
3. Enter other
values as shown in the diagram.
4. Now choose OK.
5. Cross check that, in the Tables node, the table sales appears.
* Query the linked server:
1. Start Query Analyzer.
2. Run the query:
Select * from linkedexcel
sales
3. Check that the data in Excel is listed correctly.
* Query it with local SQL Server table:
Type this query:
select itemname, qty
from linkdemo a, linkexcel...sales b
where a.itemid = b.itemid
The results will show the correct data. This is just
to demonstrate how easy it is to combine heterogeneous data in this case
from the most unlikely source Excel!
* Update data in linked server:
Type the query:
update linkexcel...sales
set qty = 10
where itemid=1
Open the Excel file and see for yourself.
When to use linked servers?
Whenever you need to use heterogeneous data in a single
query, you can use a linked server.
There are some prerequisites:
1. The linked server must always be available. This
should not be across a temporary link
2. If the linked server is likely to return large amounts of data, the bandwidth
should be adequate. Ideally, linked servers run best across a LAN.
3. If you need distributed transactions, the linked server data source must
support distributed transactions.
Summary
Heterogeneous data access can be made very simple using Linked Servers and Distributed
Queries. Think of your infrastructure and explore areas where you could use
this feature to your advantage.
 |
About the Author:Dr Nitin
Paranjape is the Chairman and MD of Maestros (Mediline). He is a consultant
with many organisations, covering appropriate technology utilisation, business
application of relevant technology, application architecture and audit as
well as knowledge transfer. He has authored more than 650 articles on various
technology-related subjects. He can be contacted at nitin@mediline.co.in |
|