sqlldr 导数据

sqlldr 可以把文本文件导入到数据库里。

命令行命令:
sqlldr userid=dbUserName/dbName control=sqlldr.ctl log=sqlldr.log bad=sqlldr.bad bindsize=1048576000 rows=500000 readsize=209715200 multithreading=TRUE direct=TRUE;

sqlldr.ctl 是导入的控制文件:

load data                  
characterset UTF8   #   需要加载中文字符时应在控制文件里指定字符集
infile csv.txt2           #  要导入的数据文件
into table scott.caiyunlog append          #  导入的目标表,导入方式是append 追加,还有其他方式见下面
(
        ip terminated  by '|',               #   每个字段可以指定自己的分隔符
        rtime terminated  by '|',
        method terminated  by '|',
        uri terminated  by '|',
        proto terminated  by '|',
        code terminated  by '|',
        respsize terminated  by '|',
        T terminated  by '|',
        D terminated by  whitespace     #  最后的那个字段没有分隔符,就用空白符,以最后的换行符为分隔符。
)

导入方式

  1. insert –为缺省方式,在数据装载开始时要求表为空
  2. append –在表中追加新记录
  3. replace –删除旧记录(用 delete from table 语句),替换成新装载的记录
  4. truncate –删除旧记录(用 truncate table 语句),替换成新装载的记录

sqlldr用法

要注意是errors参数,当错误的数据行数达到这个参数的值时会终止导入。

Usage: SQLLDR keyword=value [,keyword=value,...]

Valid Keywords:

    userid -- ORACLE username/password           
   control -- control file name                  
       log -- log file name                      
       bad -- bad file name                      
      data -- data file name                     
   discard -- discard file name                  
discardmax -- number of discards to allow          (Default all)
      skip -- number of logical records to skip    (Default 0)
      load -- number of logical records to load    (Default all)
    errors -- number of errors to allow            (Default 50)
      rows -- number of rows in conventional path bind array or between direct path data saves
               (Default: Conventional path 64, Direct path all)
  bindsize -- size of conventional path bind array in bytes  (Default 256000)
    silent -- suppress messages during run (header,feedback,errors,discards,partitions)
    direct -- use direct path                      (Default FALSE)
   parfile -- parameter file: name of file that contains parameter specifications
  parallel -- do parallel load                     (Default FALSE)
      file -- file to allocate extents from      
skip_unusable_indexes -- disallow/allow unusable indexes or index partitions  (Default FALSE)
skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable  (Default FALSE)
commit_discontinued -- commit loaded rows when load is discontinued  (Default FALSE)
  readsize -- size of read buffer                  (Default 1048576)
external_table -- use external table for load; NOT_USED, GENERATE_ONLY, EXECUTE  (Default NOT_USED)
columnarrayrows -- number of rows for direct path column array  (Default 5000)
streamsize -- size of direct path stream buffer in bytes  (Default 256000)
multithreading -- use multithreading in direct path  
resumable -- enable or disable resumable for current session  (Default FALSE)
resumable_name -- text string to help identify resumable statement
resumable_timeout -- wait time (in seconds) for RESUMABLE  (Default 7200)
date_cache -- size (in entries) of date conversion cache  (Default 1000)
no_index_errors -- abort load on any index errors  (Default FALSE)

PLEASE NOTE: Command-line parameters may be specified either by
position or by keywords.  An example of the former case is 'sqlldr
scott/tiger foo'; an example of the latter is 'sqlldr control=foo
userid=scott/tiger'.  One may specify parameters by position before
but not after parameters specified by keywords.  For example,
'sqlldr scott/tiger control=foo logfile=log' is allowed, but
'sqlldr scott/tiger control=foo log' is not, even though the
position of the parameter 'log' is correct.


欢迎关注我的微信公众号: coderbee笔记,可以更及时回复你的讨论。