csv 笔记

csv初使用

COMMON-CSV官网

项目中批量插入数据库使用的mybatis,在做数据导入的时候,觉得是mybatis的批量导入有问题,后来发现其实是没问题的。当时没看时间花费多少,就找到了common-csv来做测试,在知乎上也有很多人说用csv配合load data会很快,后来测试之后,其实感觉是差不多的,我的数据量是1000左右,可能在1w以上的数据就会出现差异,load data 这个东西感觉是效率很高,使用起来也是很方便,在项目中引用common-csv的jar包即可

示例

  • mybatis xml 文件配置

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    <insert id="createOrdersByCsv"   >
    LOAD DATA LOCAL INFILE #{filePath}
    INTO TABLE wo_work_order_test
    FIELDS TERMINATED BY ','
    lines terminated by '\r\n'
    (ID, ENTERPRISE_ID, ORDER_NO,CUSTOMER_ID,
    STATUS,PRIORITY,SUBJECT,CONTENT,SOURCE,WORKGROUP,AGENT,IS_DELETE,
    STRATEGY,HAS_FILES,DISPATCH_STATUS,BATCH_NO,CREATE_TIME,
    UPDATE_TIME,OVER_TIME,HANDING_OVER_TIME,OVER_TIME_FLAG,
    HANDING_OVER_TIME_FLAG,WG_RECEIVE_TIME,AG_RECEIVE_TIME);
    </insert>
  • java 代码生成csv文件

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    84
    85
    86
    87
    88
    89
    90
    /**
    * 通过Csv文件导入mySql数据库
    * @param entities 实体list
    * @return boolean 导入结果
    */
    private boolean createOrdersByCsv(List<OrderEntity>entities){
    boolean result=false;
    String filename = "/usr/local/jiaxin_gw_container-1.0/tmp/"+TimeUtils.getCurrentTimeStamp()+"orders-tmp.csv";
    try {
    CSVFormat csvFileFormat = CSVFormat.RFC4180.withRecordSeparator("\n");// 创建CSVFormat

    OutputStreamWriter write = new OutputStreamWriter(new FileOutputStream(filename),"UTF-8");
    BufferedWriter bufferedWriter = new BufferedWriter(write);
    CSVPrinter csvFilePrinter = new CSVPrinter(bufferedWriter, csvFileFormat);
    StringBuilder recordStr = new StringBuilder();

    for (OrderEntity entity : entities) {
    recordStr.append(entity.getID() + ",");
    recordStr.append(entity.getEnterpriseID() + ",");
    recordStr.append(entity.getOrderNo() + ",");
    if(entity.getCustomerJID()==null){
    recordStr.append("\\N" + ",");
    }else{
    recordStr.append(entity.getCustomerJID() + ",");
    }

    recordStr.append(entity.getStatus() + ",");
    recordStr.append(entity.getPriority() + ",");
    if(entity.getSubject()==null){
    recordStr.append("\\N"+ ",");
    }else{
    recordStr.append(entity.getSubject() + ",");
    }
    if(entity.getDescription()==null){
    recordStr.append("\\N" + ",");
    }else{
    recordStr.append(entity.getDescription() + ",");
    }

    recordStr.append(entity.getSource() + ",");

    if(entity.getAcceptWkgroupJID()==null){
    recordStr.append( "\\N"+ ",");
    }else{
    recordStr.append(entity.getAcceptWkgroupJID()+ ",");
    }

    if(entity.getAcceptAgentJID()==null){
    recordStr.append("\\N"+ ",");
    }else{
    recordStr.append(entity.getAcceptAgentJID() + ",");
    }

    recordStr.append(entity.getIsDelete() + ",");
    recordStr.append(entity.getDispatchStrategy() + ",");
    recordStr.append(entity.getHasAttach()+ ",");
    recordStr.append(entity.getDispatchStatus()+ ",");

    recordStr.append(entity.getBatchNo()+ ",");
    recordStr.append(entity.getCreateTime()+ ",");
    recordStr.append(entity.getUpdateTime()+ ",");
    if(entity.getTimeoutTime()==null){
    recordStr.append("\\N"+ ",");
    }else{
    recordStr.append(entity.getTimeoutTime()+ ",");
    }
    if(entity.getHandingOverTime()==null){
    recordStr.append("\\N"+ ",");
    }else{
    recordStr.append(entity.getHandingOverTime()+ ",");
    }

    recordStr.append(entity.getOverTimeFlag()+ ",");
    recordStr.append(entity.getHandingOverTimeFlag());
    recordStr.append("\r\n");
    }

    recordStr.deleteCharAt(0);
    recordStr.deleteCharAt(recordStr.length()-1);
    csvFilePrinter.printRecord(recordStr.toString());
    recordStr.delete(0, recordStr.length()-1);
    csvFilePrinter.close();
    orderDao.createOrdersByCsv(filename);
    result=true;
    } catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }
    return result;
    }

上面的\N是null的String插入到mysql为null,如果不使用\N,插入到数据库就是null的字符串,显然不对。

总结

在使用过程中,生成csv文件中,产生的头尾双引号,还没有找到解决方法,所以导致插入数据库是有双引号的。
需要在api文档中找寻相关解决方法。这是一个问题。
还有就是分隔符的问题,导入的数据要是比较正常的数据还好,如果出现跟分割符相同的数据,会直接导致插入的数据90%的错误,这个是很严重的问题,所以用的时候要注意。

wenthywang wechat
if you have any problems, you can contact me by my public wechat account.
other,please leave your email address.
无论怎么样 我都会继续Share我的一点一滴!