lundi 31 août 2015

SQL update with sub-query takes too long to run

I have a SQL update query that runs with my test data, but doesn't complete (2 hours or more) with my production data.

The purpose of the query

I have an ADDRESSES table that uses code strings instead of IDs. So for example ADDRESSES.COUNTRY_CODE = "USA" instead of 3152. For referential integrity, I am changing these code strings to code IDs.

Schema

ADDRESSES

  • ADDR_ID (PK)
  • COUNTRY_CODE (varchar)
  • Address line 1 (varchar)
  • etc.

COUNTRY_CODES

  • CODE_ID (PK)
  • CODE_STRING (varchar)
  • etc.

Steps

First, I create a temporary table to store the address records with the appropriate code ID:

CREATE TABLE ADDRESS_TEMP
AS
   SELECT ADDR_ID, CODE_ID
     FROM    ADDRESSES
          LEFT JOIN
             COUNTRY_CODES
          ON ADDRESSES.COUNTRY_CODE = CODE_STRING

Second, I null the COUNTRY_CODE column and change its type to NUMBER.

Third I set the COUNTRY_CODE column to the code IDs:

UPDATE ADDRESSES
   SET COUNTRY_CODE =
          (SELECT ADDRESS_TEMP.CODE_ID
             FROM ADDRESS_TEMP
            WHERE ADDRESS_TEMP.ADDR_ID = ADDRESSES.ADDR_ID)

It is this third step that is taking hours to complete (2 hours and counting). The ADDRESSES table has ~356,000 records. There is no error; it is still running.

Question

Why isn't this update query completing? Is it dramatically inefficient? I think I can see how the sub-query might be an N2 algorithm, but I'm inexperienced with SQL.



via Chebli Mohamed

Aucun commentaire:

Enregistrer un commentaire