Skip to content

fix incorrect split for select query #754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions packages/app/src/DBSearchPage.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ import {
DisplayType,
Filter,
} from '@hyperdx/common-utils/dist/types';
import { splitAndTrimCSV } from '@hyperdx/common-utils/dist/utils';
import { splitAndTrimWithBracket } from '@hyperdx/common-utils/dist/utils';
import {
ActionIcon,
Box,
Expand Down Expand Up @@ -884,7 +884,7 @@ function DBSearchPage() {
};
}, [chartConfig, searchedTimeRange]);

const displayedColumns = splitAndTrimCSV(
const displayedColumns = splitAndTrimWithBracket(
dbSqlRowTableConfig?.select ??
searchedSource?.defaultTableSelectExpression ??
'',
Expand Down
6 changes: 3 additions & 3 deletions packages/app/src/components/DBRowTable.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ import {
ChartConfigWithDateRange,
SelectList,
} from '@hyperdx/common-utils/dist/types';
import { splitAndTrimCSV } from '@hyperdx/common-utils/dist/utils';
import { splitAndTrimWithBracket } from '@hyperdx/common-utils/dist/utils';
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Granularity and splitAndTrimCSV is not used at here anymore, remove it since it cause conflict.

import { Box, Code, Flex, Text } from '@mantine/core';
import { FetchNextPageOptions } from '@tanstack/react-query';
import {
Expand Down Expand Up @@ -696,10 +696,10 @@ function mergeSelectWithPrimaryAndPartitionKey(
.map(k => extractColumnReference(k.trim()))
.filter((k): k is string => k != null && k.length > 0);
const primaryKeyArr =
primaryKeys.trim() !== '' ? splitAndTrimCSV(primaryKeys) : [];
primaryKeys.trim() !== '' ? splitAndTrimWithBracket(primaryKeys) : [];
const allKeys = [...partitionKeyArr, ...primaryKeyArr];
if (typeof select === 'string') {
const selectSplit = splitAndTrimCSV(select);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the root cause, make input:
JSONExtractString(Body, 'object', 'c'),JSONExtractString(Body, 'object', 'note')
become
["JSONExtractString(Body", "'object'", "'c')", "'note')"]
after dedupe set(selectColumns).

Copy link
Member

@wrn14897 wrn14897 Apr 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find! have we thought about using module like csv-parse? (https://www.npmjs.com/package/csv-parse)

Copy link
Author

@LYHuang LYHuang Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the csv-parse, however it does not work well with parentheses and brackets.
Parsing option quote only works for same character such as: "content", and it must at the beginning of the string, ex: "good", err"or".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it. in this case, I think we want to make sure the unit tests cover all use cases. Does it work with the select field like "foo,bar",field2,field3 as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will split it to:

[
  '"foo',
  'bar"',
  'field2',
  'field3'
]

It should works fine if we don't get the input like: "foo,bar","foo,differentBar",field2,field3 and also doing the dedupe.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will work on quote stuff, will update PR later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, I'd imagine the return value from the example above is ["foo,bar", "field2", "field3"] although it's a corner case

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update the split function to support single quote and double quote, feel free to check test case.

const selectSplit = splitAndTrimWithBracket(select);
const selectColumns = new Set(selectSplit);
const additionalKeys = allKeys.filter(k => !selectColumns.has(k));
return {
Expand Down
11 changes: 7 additions & 4 deletions packages/app/src/source.ts
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,10 @@ import {
JSDataType,
} from '@hyperdx/common-utils/dist/clickhouse';
import { MetricsDataType, TSource } from '@hyperdx/common-utils/dist/types';
import { hashCode, splitAndTrimCSV } from '@hyperdx/common-utils/dist/utils';
import {
hashCode,
splitAndTrimWithBracket,
} from '@hyperdx/common-utils/dist/utils';
import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query';

import { hdxServer } from '@/api';
Expand Down Expand Up @@ -43,7 +46,7 @@ function getLocalSources(): TSource[] {
// If a user specifies a timestampValueExpression with multiple columns,
// this will return the first one. We'll want to refine this over time
export function getFirstTimestampValueExpression(valueExpression: string) {
return splitAndTrimCSV(valueExpression)[0];
return splitAndTrimWithBracket(valueExpression)[0];
}

export function getSpanEventBody(eventModel: TSource) {
Expand All @@ -65,7 +68,7 @@ export function getEventBody(eventModel: TSource) {
: undefined) ??
eventModel.implicitColumnExpression; //??
// (eventModel.kind === 'log' ? 'Body' : 'SpanName')
const multiExpr = splitAndTrimCSV(expression ?? '');
const multiExpr = splitAndTrimWithBracket(expression ?? '');
return multiExpr.length === 1 ? expression : multiExpr[0]; // TODO: check if we want to show multiple columns
}

Expand Down Expand Up @@ -215,7 +218,7 @@ export async function inferTableSourceConfig({
connectionId,
})
).primary_key;
const keys = splitAndTrimCSV(primaryKeys);
const keys = splitAndTrimWithBracket(primaryKeys);

const isOtelLogSchema = hasAllColumns(columns, [
'Timestamp',
Expand Down
141 changes: 140 additions & 1 deletion packages/common-utils/src/__tests__/utils.test.ts
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
import { splitAndTrimCSV } from '../utils';
import { splitAndTrimCSV, splitAndTrimWithBracket } from '../utils';

describe('utils', () => {
describe('splitAndTrimCSV', () => {
Expand Down Expand Up @@ -26,4 +26,143 @@ describe('utils', () => {
expect(splitAndTrimCSV(',, ,,')).toEqual([]);
});
});

describe('splitAndTrimWithBracket', () => {
it('should split a simple comma-separated string', () => {
const input = 'column1, column2, column3';
const expected = ['column1', 'column2', 'column3'];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle function calls with commas in parameters', () => {
const input =
"Timestamp, ServiceName, JSONExtractString(Body, 'c'), JSONExtractString(Body, 'msg')";
const expected = [
'Timestamp',
'ServiceName',
"JSONExtractString(Body, 'c')",
"JSONExtractString(Body, 'msg')",
];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle nested function calls', () => {
const input = 'col1, func1(a, b), col2, func2(c, func3(d, e)), col3';
const expected = [
'col1',
'func1(a, b)',
'col2',
'func2(c, func3(d, e))',
'col3',
];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle square brackets in column expressions', () => {
const input = "col1, array[1, 2, 3], jsonb_path_query(data, '$[*]')";
const expected = [
'col1',
'array[1, 2, 3]',
"jsonb_path_query(data, '$[*]')",
];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle mixed parentheses and square brackets', () => {
const input = "col1, func(array[1, 2], obj['key']), col2['nested'][0]";
const expected = [
'col1',
"func(array[1, 2], obj['key'])",
"col2['nested'][0]",
];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should trim whitespace from resulting columns', () => {
const input = ' col1 , func(a, b) , col2 ';
const expected = ['col1', 'func(a, b)', 'col2'];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle empty input', () => {
expect(splitAndTrimWithBracket('')).toEqual([]);
});

it('should handle input with only spaces', () => {
expect(splitAndTrimWithBracket(' ')).toEqual([]);
});

it('should skip empty elements', () => {
const input = 'col1,,col2, ,col3';
const expected = ['col1', 'col2', 'col3'];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle quoted strings with commas', () => {
const input = "col1, concat('Hello, World!'), col2";
const expected = ['col1', "concat('Hello, World!')", 'col2'];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle double quoted strings with commas', () => {
const input = 'col1, "quoted, string", col3';
const expected = ['col1', '"quoted, string"', 'col3'];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle single quoted strings with commas', () => {
const input = `col1, 'quoted, string', col3`;
const expected = ['col1', `'quoted, string'`, 'col3'];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle mixed quotes with commas', () => {
const input = `col1, "double, quoted", col2, 'single, quoted', col3`;
const expected = [
'col1',
`"double, quoted"`,
'col2',
`'single, quoted'`,
'col3',
];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle quotes inside function calls', () => {
const input = 'col1, func("text with , comma", \'another, text\'), col2';
const expected = [
'col1',
'func("text with , comma", \'another, text\')',
'col2',
];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle brackets inside quoted strings', () => {
const input =
'col1, "string with (brackets, inside)", col2, \'string with [brackets, inside]\', col3';
const expected = [
'col1',
'"string with (brackets, inside)"',
'col2',
"'string with [brackets, inside]'",
'col3',
];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});

it('should handle real-world SQL column list example', () => {
const input =
"Timestamp, ServiceName, JSONExtractString(Body, 'c'), JSONExtractString(Body, 'msg'), Timestamp, \"foo, bar\"";
const expected = [
'Timestamp',
'ServiceName',
"JSONExtractString(Body, 'c')",
"JSONExtractString(Body, 'msg')",
'Timestamp',
'"foo, bar"',
];
expect(splitAndTrimWithBracket(input)).toEqual(expected);
});
});
});
6 changes: 4 additions & 2 deletions packages/common-utils/src/queryParser.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ import SqlString from 'sqlstring';

import { convertCHTypeToPrimitiveJSType, JSDataType } from '@/clickhouse';
import { Metadata } from '@/metadata';
import { splitAndTrimCSV } from '@/utils';
import { splitAndTrimWithBracket } from '@/utils';

function encodeSpecialTokens(query: string): string {
return query
Expand Down Expand Up @@ -550,7 +550,9 @@ export class CustomSchemaSQLSerializerV2 extends SQLSerializer {
);
}

const expressions = splitAndTrimCSV(this.implicitColumnExpression);
const expressions = splitAndTrimWithBracket(
this.implicitColumnExpression,
);

return {
column:
Expand Down
4 changes: 2 additions & 2 deletions packages/common-utils/src/renderChartConfig.ts
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ import {
import {
convertDateRangeToGranularityString,
getFirstTimestampValueExpression,
splitAndTrimCSV,
splitAndTrimWithBracket,
} from '@/utils';

// FIXME: SQLParser.ColumnRef is incomplete
Expand Down Expand Up @@ -424,7 +424,7 @@ async function timeFilterExpr({
with?: ChartConfigWithDateRange['with'];
includedDataInterval?: string;
}) {
const valueExpressions = splitAndTrimCSV(timestampValueExpression);
const valueExpressions = splitAndTrimWithBracket(timestampValueExpression);
const startTime = dateRange[0].getTime();
const endTime = dateRange[1].getTime();

Expand Down
53 changes: 52 additions & 1 deletion packages/common-utils/src/utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -18,10 +18,61 @@ export function splitAndTrimCSV(input: string): string[] {
.filter(column => column.length > 0);
}

// Replace splitAndTrimCSV, should remove splitAndTrimCSV later
export function splitAndTrimWithBracket(input: string): string[] {
let parenCount: number = 0;
let squareCount: number = 0;
let inSingleQuote: boolean = false;
let inDoubleQuote: boolean = false;

const res: string[] = [];
let cur: string = '';
for (const c of input + ',') {
if (c === '"' && !inSingleQuote) {
inDoubleQuote = !inDoubleQuote;
cur += c;
continue;
}

if (c === "'" && !inDoubleQuote) {
inSingleQuote = !inSingleQuote;
cur += c;
continue;
}
// Only count brackets when not in quotes
if (!inSingleQuote && !inDoubleQuote) {
if (c === '(') {
parenCount++;
} else if (c === ')') {
parenCount--;
} else if (c === '[') {
squareCount++;
} else if (c === ']') {
squareCount--;
}
}

if (
c === ',' &&
parenCount === 0 &&
squareCount === 0 &&
!inSingleQuote &&
!inDoubleQuote
) {
const trimString = cur.trim();
if (trimString) res.push(trimString);
cur = '';
} else {
cur += c;
}
}
return res;
}

// If a user specifies a timestampValueExpression with multiple columns,
// this will return the first one. We'll want to refine this over time
export function getFirstTimestampValueExpression(valueExpression: string) {
return splitAndTrimCSV(valueExpression)[0];
return splitAndTrimWithBracket(valueExpression)[0];
}

export enum Granularity {
Expand Down