5 changes: 3 additions & 2 deletions README.md
@@ -236,8 +236,9 @@ go get -u github.com/hackafterdark/carta

## Important Notes

Carta removes any duplicate rows. This is a side effect of the data mapping as it is unclear which object to instantiate if the same data arrives more than once.
If this is not a desired outcome, you should include a uniquely identifiable columns in your query and the corresponding fields in your structs.
When mapping to **slices of structs**, Carta removes duplicate entities. This is a side effect of the data mapping process, which merges rows that identify the same entity (e.g., a `Blog` with the same ID appearing in multiple rows due to a `JOIN`). To ensure correct mapping, you should always include uniquely identifiable columns (like a primary key) in your query for each struct entity.

When mapping to **slices of basic types** (e.g., `[]string`, `[]int`), every row from the query is treated as a unique element, and **no de-duplication occurs**.
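The two behaviors above can be illustrated with a minimal stdlib-only sketch (no database); the slice values and entity-keying shown here are hypothetical stand-ins for what Carta does with scanned rows:

```go
package main

import "fmt"

func main() {
	rows := []string{"tag1", "tag2", "tag1"} // query returned 3 rows

	// Mapping to a slice of a basic type: every row is kept, duplicates included.
	basic := append([]string(nil), rows...)
	fmt.Println(len(basic)) // 3

	// Mapping to a slice of structs keyed on a unique column: rows that
	// identify the same entity merge into a single element.
	seen := map[string]bool{}
	var entities []string
	for _, id := range rows {
		if !seen[id] {
			seen[id] = true
			entities = append(entities, id)
		}
	}
	fmt.Println(len(entities)) // 2
}
```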

To prevent relatively expensive reflect operations, Carta caches the structure of your struct using the column names of your query response as well as the type of your struct.

37 changes: 37 additions & 0 deletions ai_plans/FIX_DUPLICATE_ROWS.md
@@ -0,0 +1,37 @@
# Problem: Incorrect De-duplication When Mapping to Basic Slices

## Summary
The `carta` library is designed to de-duplicate entities when mapping SQL rows to slices of structs (e.g., `[]User`). This is achieved by generating a unique ID for each entity based on the content of its primary key columns. This behavior is correct for handling `JOIN`s where a single entity might appear across multiple rows.

However, this same logic is incorrectly applied when the mapping destination is a slice of a basic type (e.g., `[]string`, `[]int`). In this scenario, rows with duplicate values are treated as the same entity and are de-duplicated, which is incorrect. The desired behavior is to preserve every row from the result set, including duplicates.

This issue is the root cause for the following problems:
1. The `if m.IsBasic` code path in `load.go` lacks test coverage because no tests exist for mapping to basic slices.
2. Attempts to write such tests lead to infinite loops and incorrect behavior because the column allocation and unique ID generation logic are not designed to handle this case.

## Proposed Solution
The solution is to create a distinct execution path for "basic mappers" (`m.IsBasic == true`) that ensures every row is treated as a unique element.

This will be accomplished in two main steps:

### 1. Fix Column Allocation (`allocateColumns`)
The logic will be modified to enforce a clear rule for basic slices: the source SQL query must return **exactly one column**.

- If `m.IsBasic` is true, the function will bypass the existing name-matching logic.
- It will validate that only one column is present in the query result.
- This single column will be assigned as the `PresentColumn` for the mapper.
- If more than one column is found, the function will return an error to prevent ambiguity.
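The rule above can be sketched as a standalone function; `column` here is a hypothetical simplification of Carta's internal column metadata, and the error message mirrors the one proposed in this change:

```go
package main

import "fmt"

// column is a simplified stand-in for carta's internal column metadata.
type column struct {
	typ         string
	columnIndex int
}

// allocateBasic enforces the single-column rule for basic-slice mappers:
// the query's only column is claimed, and any ambiguity is an error.
func allocateBasic(columns map[string]column) (string, error) {
	if len(columns) != 1 {
		return "", fmt.Errorf("carta: when mapping to a slice of a basic type, the query must return exactly one column (got %d)", len(columns))
	}
	for name := range columns {
		return name, nil
	}
	return "", nil // unreachable: len(columns) == 1
}

func main() {
	// Two columns are ambiguous for a []string destination.
	_, err := allocateBasic(map[string]column{
		"tag": {typ: "TEXT", columnIndex: 0},
		"id":  {typ: "INT", columnIndex: 1},
	})
	fmt.Println(err != nil) // true
}
```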

### 2. Fix Unique ID Generation (`loadRow`)
The logic will be modified to generate a unique ID based on the row's position rather than its content.

- If `m.IsBasic` is true, the call to `getUniqueId(row, m)` will be bypassed.
- A new, position-based unique ID will be generated for each row (e.g., using a simple counter that increments with each row processed).
- This ensures that every row, regardless of its content, is treated as a distinct element to be added to the destination slice.
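The two ID strategies can be contrasted in a small sketch; `uidFor` and the content-based ID argument are hypothetical simplifications of `getUniqueId`:

```go
package main

import (
	"fmt"
	"strconv"
)

type uniqueValId string

// uidFor sketches the two strategies: basic mappers key each row by its
// position so duplicates survive, while struct mappers key by content so
// JOIN-duplicated entities merge into one element.
func uidFor(isBasic bool, rowCount int, contentId uniqueValId) uniqueValId {
	if isBasic {
		return uniqueValId("row-" + strconv.Itoa(rowCount))
	}
	return contentId
}

func main() {
	// Two rows with identical content get distinct IDs under position keying...
	fmt.Println(uidFor(true, 0, "tag1") != uidFor(true, 1, "tag1"))
	// ...but collapse to the same ID under content keying.
	fmt.Println(uidFor(false, 0, "tag1") == uidFor(false, 1, "tag1"))
}
```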

This approach preserves the existing, correct behavior for struct mapping while introducing a new, robust path for handling basic slices correctly.

## Plan
1. **Modify `column.go`**: Update the `allocateColumns` function to implement the single-column rule for basic mappers.
2. **Modify `load.go`**: Update the `loadRow` function to use a position-based counter for unique ID generation when `m.IsBasic` is true.
3. **Add Tests**: Create a new test case in `mapper_test.go` that maps a query result to a slice of a basic type (e.g., `[]string`) to validate the fix and provide coverage for the `m.IsBasic` code path.
18 changes: 10 additions & 8 deletions column.go
@@ -2,6 +2,7 @@ package carta

import (
"database/sql"
"fmt"
"sort"
"strings"
)
@@ -17,16 +18,17 @@ type column struct {
func allocateColumns(m *Mapper, columns map[string]column) error {
presentColumns := map[string]column{}
if m.IsBasic {
candidates := getColumnNameCandidates("", m.AncestorNames, m.Delimiter)
if len(columns) != 1 {
return fmt.Errorf("carta: when mapping to a slice of a basic type, the query must return exactly one column (got %d)", len(columns))
}
for cName, c := range columns {
if _, ok := candidates[cName]; ok {
presentColumns[cName] = column{
typ: c.typ,
name: cName,
columnIndex: c.columnIndex,
}
delete(columns, cName) // dealocate claimed column
presentColumns[cName] = column{
typ: c.typ,
name: cName,
columnIndex: c.columnIndex,
}
delete(columns, cName)
break
}
} else {
for i, field := range m.Fields {
19 changes: 13 additions & 6 deletions load.go
@@ -2,9 +2,9 @@ package carta

import (
"database/sql"
"errors"
"fmt"
"reflect"
"strconv"

"github.com/hackafterdark/carta/value"
)
@@ -18,16 +18,18 @@ func (m *Mapper) loadRows(rows *sql.Rows, colTyps []*sql.ColumnType) (*resolver,
colTypNames[i] = colTyps[i].DatabaseTypeName()
}
rsv := newResolver()
rowCount := 0
for rows.Next() {
for i := 0; i < len(colTyps); i++ {
row[i] = value.NewCell(colTypNames[i])
}
if err = rows.Scan(row...); err != nil {
return nil, err
}
if err = loadRow(m, row, rsv); err != nil {
if err = loadRow(m, row, rsv, rowCount); err != nil {
return nil, err
}
rowCount++
}
if err := rows.Err(); err != nil {
return nil, err
@@ -56,16 +58,21 @@ func (m *Mapper) loadRows(rows *sql.Rows, colTyps []*sql.ColumnType) (*resolver,
// for example, if a blog has many Authors
//
// rows are actually []*Cell; they are passed here as interface{} since sql Scan requires []interface{}
func loadRow(m *Mapper, row []interface{}, rsv *resolver) error {
func loadRow(m *Mapper, row []interface{}, rsv *resolver, rowCount int) error {
var (
err error
dstField reflect.Value // destination field to be set with
cell *value.Cell
elem *element
found bool
uid uniqueValId
)

uid := getUniqueId(row, m)
if m.IsBasic {
uid = uniqueValId("row-" + strconv.Itoa(rowCount))
} else {
uid = getUniqueId(row, m)
}

if elem, found = rsv.elements[uid]; !found {
// unique row mapping found, new object
@@ -103,7 +110,7 @@ func loadRow(m *Mapper, row []interface{}, rsv *resolver) error {
if cell.IsNull() {
_, nullable := value.NullableTypes[typ]
if !(isDstPtr || nullable) {
return errors.New(fmt.Sprintf("carta: cannot load null value to type %s for column %s", typ, col.name))
return fmt.Errorf("carta: cannot load null value to type %s for column %s", typ, col.name)
}
// no need to set destination if cell is null
} else {
@@ -216,7 +223,7 @@ func loadRow(m *Mapper, row []interface{}, rsv *resolver) error {
if subMap.isNil(row) {
continue
}
if err = loadRow(subMap, row, elem.subMaps[i]); err != nil {
if err = loadRow(subMap, row, elem.subMaps[i], rowCount); err != nil {
return err
}
}
12 changes: 6 additions & 6 deletions load_test.go
@@ -49,7 +49,7 @@ func TestLoadRow(t *testing.T) {
}

rsv := newResolver()
err = loadRow(m, row, rsv)
err = loadRow(m, row, rsv, 0)
if err != nil {
t.Fatalf("error loading row: %s", err)
}
@@ -98,7 +98,7 @@ func TestLoadRowNullValue(t *testing.T) {
}

rsv := newResolver()
err = loadRow(m, row, rsv)
err = loadRow(m, row, rsv, 0)
if err != nil {
t.Fatalf("error loading row: %s", err)
}
@@ -167,7 +167,7 @@ func TestLoadRowDataTypes(t *testing.T) {
}

rsv := newResolver()
err = loadRow(m, row, rsv)
err = loadRow(m, row, rsv, 0)
if err != nil {
t.Fatalf("error loading row: %s", err)
}
@@ -225,7 +225,7 @@ func TestLoadRowConversionError(t *testing.T) {
}

rsv := newResolver()
err = loadRow(m, row, rsv)
err = loadRow(m, row, rsv, 0)
if err == nil {
t.Fatalf("expected a conversion error, but got nil")
}
@@ -263,7 +263,7 @@ func TestLoadRowNullToNonNull(t *testing.T) {
}

rsv := newResolver()
err = loadRow(m, row, rsv)
err = loadRow(m, row, rsv, 0)
if err == nil {
t.Fatalf("expected an error when loading a null value to a non-nullable field, but got nil")
}
@@ -297,7 +297,7 @@ func TestLoadRowNullTypes(t *testing.T) {
}

rsv := newResolver()
err = loadRow(m, row, rsv)
err = loadRow(m, row, rsv, 0)
if err != nil {
t.Fatalf("error loading row: %s", err)
}
63 changes: 63 additions & 0 deletions mapper_test.go
@@ -508,3 +508,66 @@ func TestMapDeeplyNestedNoLabels(t *testing.T) {
t.Errorf("there were unfulfilled expectations: %s", err)
}
}

func TestMapToBasicSlice(t *testing.T) {
db, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("an error '%s' was not expected when opening a stub database connection", err)
}
defer db.Close()

rows := sqlmock.NewRows([]string{"tag"}).
AddRow("tag1").
AddRow("tag2").
AddRow("tag1") // Duplicate tag

mock.ExpectQuery("SELECT (.+) FROM tags").WillReturnRows(rows)

sqlRows, err := db.Query("SELECT * FROM tags")
if err != nil {
t.Fatalf("error '%s' was not expected when querying rows", err)
}

var tags []string
err = Map(sqlRows, &tags)
if err != nil {
t.Errorf("error was not expected while mapping rows: %s", err)
}

if len(tags) != 3 {
t.Fatalf("expected 3 tags, got %d", len(tags))
}

expectedTags := []string{"tag1", "tag2", "tag1"}
if !reflect.DeepEqual(tags, expectedTags) {
t.Errorf("expected tags to be %+v, but got %+v", expectedTags, tags)
}

if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("there were unfulfilled expectations: %s", err)
}
}

func TestMapToBasicSlice_MultipleColumnsError(t *testing.T) {
db, mock, err := sqlmock.New()
if err != nil {
t.Fatal(err)
}
defer db.Close()

rows := sqlmock.NewRows([]string{"tag", "extra"}).
AddRow("tag1", "x")

mock.ExpectQuery("SELECT (.+) FROM tags").WillReturnRows(rows)

sqlRows, err := db.Query("SELECT * FROM tags")
if err != nil {
t.Fatal(err)
}

var tags []string
err = Map(sqlRows, &tags)
if err == nil {
t.Fatalf("expected error when mapping to []string with multiple columns, got nil")
}
}