DELETE FROM schema1.table1 a
WHERE a.ROWID >
ANY (SELECT b.ROWID
FROM schema1.table1 b
WHERE a.col1 = b.col1
AND a.col2 = b.col2
AND a.col3 = b.col3);
Monday, May 5, 2008
Removing duplicate rows from a table
Removing duplicate rows from a table has several ways. I prefer below method, because it is easy to implement. For table, schema1.table1 having columns col1, col2, col3 which will be primary key or unique key columns after removing duplicate entries will be accomplished by below SQL:
Subscribe to:
Post Comments (Atom)
Windows (powershell) counterparts of Linux commands
You may find Windows mostly powershell equivalent of frequently used Linux commands here. I will update this post, with newer ones by the ti...
-
Simple but very handy way of taking exports without knowing database password. expdp \'/ as sysdba\' DIRECTORY=DATA_PUMP_DIR DUMPFIL...
-
It seems an easy topic, I prefer being careful with the easy ones... so it is handy to have all you need, in one place and which will lead i...
-
This directory is created in the home directory of the oracle user, after making a local connection. I had tested it as follows: [oracle@geo...
1 comment:
Performans bakış açısı ile bir örnek;
SELECT /*+ parallel */
year_month, COUNT(subscriber_id), COUNT(DISTINCT subscriber_id)
FROM eul.dm_subscriber_history_01 t
GROUP BY year_month;
200711 60791328 60261922
avrupa@oracle $ cat tmp.sql
spool tmp.log
conn eul/eul
SET LINESIZE 2000
SET AUTOTRACE TRACEONLY
SET TIME ON
SET TIMING ON
ALTER SESSION ENABLE PARALLEL DDL ;
ALTER SESSION ENABLE PARALLEL DML ;
alter session set max_dump_file_size=unlimited ;
alter session set timed_statistics = true ;
alter session set STATISTICS_LEVEL = ALL ;
alter session set "_rowsource_execution_statistics" = true ;
alter session set tracefile_identifier = duplicates ;
SELECT /*+ parallel */
COUNT(*)
FROM eul.dm_subscriber_history_01
WHERE ROWID NOT IN (SELECT /*+ parallel */ a.rid
FROM (SELECT /*+ parallel */
MAX(ROWID) rid, year_month, subscriber_id, COUNT(1)
FROM eul.dm_subscriber_history_01
GROUP BY year_month, subscriber_id
HAVING COUNT(1) > 1) a);
/*
Elapsed: 00:08:37.55
1886131 consistent gets
2005144 physical reads
*/
SELECT /*+ parallel */
COUNT(*)
FROM eul.dm_subscriber_history_01
WHERE ROWID IN (SELECT /*+ parallel */ rid
FROM (SELECT /*+ parallel */
ROWID rid,
row_number() over(PARTITION BY year_month, subscriber_id ORDER BY ROWID) rn
FROM eul.dm_subscriber_history_01)
WHERE rn 1);
/*
Elapsed: 00:03:12.03
1472472 consistent gets
1067141 physical reads
*/
exit;
Silinecek kayıt miktarı tüm kayıtların bence ~%10 civarına ulaşınca direkt CTAS nologging paralel girmek daha doğru olabilir.
Kaynakça : http://www.oracle.com/technology/oramag/oracle/04-jul/o44asktom.html
Post a Comment