How do I load text or csv file data into SQL Server?

本文介绍如何使用 SQL Server 的 BULK INSERT 命令从 CSV 或其他格式文件加载数据到数据库表中。包括指定分隔符、忽略标题行及设置错误阈值等实用技巧。
If you need to load data into SQL Server (e.g. from log files, csv files, chat transcripts etc), then chances are, you're going to be making good friends with the BULK INSERT command. 
 
You can specify row and column delimiters, making it easy to import files in comma-separated values (CSV), Tab-separated values (TSV), or any other delimiter (e.g. the pipe character is commonly used, since it rarely exists in valid data). You can also tell it to skip any number of header rows (these are fairly common in many log file formats). 
 
So, let's run through a few examples. We have this table: 
 
 
And let's say we have this CSV file, saced as c:/file1.csv: 
 

 
The command to bulk insert this data would be: 
 

 
Now, if we had tab-separated values, it would look like this: 
 

 
Note that a row delimiter often has to be experimented with, because often log files produce "columns" in the table by putting the value and then the delimiter. So, in many cases, you might actually be looking to separate rows by column delimiter + row delimiter. So, the last column has a dangling column delimiter that isn't really necessary (but we have to deal with it). I currently have processes that look like this, to deal with files that have trailing tab characters after every column: 
 

 
And earlier I mentioned the case where you have header rows that are almost never going to match the data types of the target table (and you wouldn't want the headers mixed in with the data, even if all the data types were character). So, let's say the CSV file actually looked like this: 
 

 
If you try the BULK INSERT command from above, you will get this error: 
 

 
This is because the first row doesn't contain valid data, and the insert fails. We could tell BULK INSERT to ignore the first row in the data file, by specifying the starting row with the FIRSTROW parameter: 
 

 
Finally, you can also specify how many errors you want to allow before considering that the BULK INSERT failed. We can never really be confident that a log file from some external source will always have "perfect" data in it. Failures can occur when you have character data that is too big for the column in the table, or if you have a string in an integer column, or if your date is malformed, or several other potential scenarios. So, you probably want some threshold where more than n rows will cause the BULK INSERT to fail. You can specify this limit using the MAXERRORS parameter. (I've already requested for Yukon that they add a MAXERRORPERCENT so that there will be an appropriate failure ratio for both large and small files.) 
 

 
You can also use bcp to move data from a flat file into an existing table. BULK INSERT and the bcp utility are documented in Books Online.
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值