Ran static analysis and cleaned up a few issues
ChrisMcGowanAu committed Nov 5, 2024
1 parent e662807 commit 9dedec9
Showing 4 changed files with 34 additions and 20 deletions.
31 changes: 26 additions & 5 deletions README.md
@@ -4,9 +4,26 @@

A parser for csv files in C, for C and C++

This is a RFC 4180 parser for reading csv files for C.
This is an RFC 4180 parser for reading csv files for C and C++.
It also handles csv files created by Excel, including multi line cells.
It can handle csv files with multi-line cells, and handles different new line conventions.
The latest change allows it to read very large csv files ( tested to 500,000 rows )

The csv file is read into a linked list of rows, each row has a linked list of cells for that row. After creation of this list, an array of row pointers is created to allow fast lookup.

This allows each cell to be accessed directly and in any order.

//////////////////////////////////////
// The rows are a linked list
// Each row has a linked list of cells
// R - C - C
// |
// R - C -C - C - C - C
// |
// R - C -C - C - C
// |
// etc
//////////////////////////////////////


There are just 3 functions, one to read the file and one to get access to the data in each cell, and one to free the memory used when done. It compiles fine using C or C++, so it can be used for C and C++.

@@ -16,8 +33,11 @@ cell = getCell(csvPtr, row, column);

void freeMem(csvPtr);

See csvTest.c for a C example of reading and recreating the csv file to stdout.

For the C++ version, there is a simple class defined.

See csvTest.cpp for a C++ example of reading and recreating the csv file to stdout.

I have written several csv parsers over the years. These parsers used the troublesome C functions 'strsep' and 'strtok'.
Either I was using them wrong or they are buggy under high load; they also behaved differently on different machines
@@ -29,13 +49,15 @@ Whilst it is very fast, It should read files of any size, limited by how much me
I used to roll my own linked lists before the C++ STL came along, and it was fun for me to write code using linked lists again.

I have tried to make it RFC 4180 compliant.
This repository consists of three files
This repository consists of two files
csvParser.h
csvParser.c

-- for testing and an example
csvTest.c

The third file csvTest.c can be used as an example and for testing
The csv file is read by the this function
The csv file is read by this function

CsvType *csv = readCsv(filename, csvSeperator);
Cells can be accessed via this function
@@ -46,7 +68,6 @@ csvParser.h
csvParser.cpp (This is the same file as csvParser.c)
csvTest.cpp


The csv file is read by this function
CsvType *csv = readCsv(filename, csvSeperator);
Cells can be accessed via this function
19 changes: 6 additions & 13 deletions csvParser.c
@@ -123,22 +123,15 @@ CsvCellType getCell(CsvType *csv, uint32_t row, uint32_t col) {
cell.status = emptyCell;
cell.lastCellInRow = true;
cell.bytes = 0;
cell.cellContents = nullptr;
RowType *rowPtr = csv->rowLookup[row];
/*
**/
uint32_t rowNumber = 0;
if (rowPtr == nullptr) {
if (DEBUGME > 1)
fprintf(stderr, "No Rows defined\n");
cell.status = missingRow;
return (cell);
}
/*
while (rowPtr->next != nullptr && rowNumber != row) {
rowPtr = rowPtr->next;
rowNumber++;
}
**/
rowNumber = rowPtr->rowId;
if (rowNumber != row) {
if (DEBUGME > 0)
@@ -356,7 +349,7 @@ uint32_t countAltDquotes(char *buffer) {
// on very large csv files.
////////////////////////////////////////////////////
void buildRowIndex(CsvType *csv) {
csv->rowLookup = (RowType **)malloc((csv->numRows + 1) * sizeof(uint64_t));
csv->rowLookup = (RowType **)malloc((csv->numRows + 1) * sizeof(RowType*));
RowType *rowPtr = csv->firstRow;
RowType *nextPtr = nullptr;
uint32_t rowIndex = 0;
@@ -377,14 +370,14 @@ void buildRowIndex(CsvType *csv) {
// Read the csv file
////////////////////////////////////////////////////
CsvType *readCsv(char *filename, char sep) {
FILE *fp = NULL;
FILE *fp = nullptr;
CsvType *csv = (CsvType *)malloc(sizeof(CsvType));
bzero((void *)csv, sizeof(CsvType));
memset((void *)csv, 0, sizeof(struct CsvType));
fp = fopen(filename, "r");
if (fp != NULL) {
if (fp != nullptr) {
uint32_t lines = 0;
char buffer[LINEMAX];
bzero((void *)buffer, sizeof(buffer));
memset((void *) buffer, 0, sizeof(buffer));
uint32_t startIdx = 0;
while (fgets(&buffer[startIdx], LINEMAX, fp) != nullptr) {
if (startIdx > 0) {
2 changes: 1 addition & 1 deletion csvParser.h
@@ -81,7 +81,7 @@ void freeMem(CsvType *csv);
}
#endif
// For C++ only
// Class difinitions
// Class definitions
#ifdef __cplusplus
class CsvClass {
public:
2 changes: 1 addition & 1 deletion csvTest.c
@@ -1,6 +1,6 @@
#include "csvParser.h"
#include <stdio.h>
#include <unistd.h>
#include "csvParser.h"

int main(int argc, char **argv) {
if (argc < 2) {