1
0
mirror of synced 2025-12-31 15:03:11 -05:00
Files
airbyte/docs/integrations/sources/faker.md
Ella Rohm-Ensing 841942b0e6 Ship a breaking change to faker (#28485)
* bump the metadata version

* add changelog and docker tag

* add breakingChanges metadata

* add migration guide

* resolve qa check failure becuase of missing header

* update changelog
2023-07-19 16:37:18 -06:00

120 lines
8.4 KiB
Markdown

# Faker
## Sync overview
The Sample Data (Faker) source generates sample data using the python [`mimesis`](https://mimesis.name/en/master/) package.
### Output schema
This source will generate an "e-commerce-like" dataset with users, products, and purchases. Here's what is produced at a Postgres destination connected to this source:
```sql
CREATE TABLE "public"."users" (
"address" jsonb,
"occupation" text,
"gender" text,
"academic_degree" text,
"weight" int8,
"created_at" timestamptz,
"language" text,
"telephone" text,
"title" text,
"updated_at" timestamptz,
"nationality" text,
"blood_type" text,
"name" text,
"id" float8,
"age" int8,
"email" text,
"height" text,
-- "_airbyte_ab_id" varchar,
-- "_airbyte_emitted_at" timestamptz,
-- "_airbyte_normalized_at" timestamptz,
-- "_airbyte_users_hashid" text
);
CREATE TABLE "public"."users_address" (
"_airbyte_users_hashid" text,
"country_code" text,
"province" text,
"city" text,
"street_number" text,
"state" text,
"postal_code" text,
"street_name" text,
-- "_airbyte_ab_id" varchar,
-- "_airbyte_emitted_at" timestamptz,
-- "_airbyte_normalized_at" timestamptz,
-- "_airbyte_address_hashid" text
);
CREATE TABLE "public"."products" (
"id" float8,
"make" text,
"year" float8,
"model" text,
"price" float8,
"created_at" timestamptz,
-- "_airbyte_ab_id" varchar,
-- "_airbyte_emitted_at" timestamptz,
-- "_airbyte_normalized_at" timestamptz,
-- "_airbyte_dev_products_hashid" text,
);
CREATE TABLE "public"."purchases" (
"id" float8,
"user_id" float8,
"product_id" float8,
"purchased_at" timestamptz,
"added_to_cart_at" timestamptz,
"returned_at" timestamptz,
-- "_airbyte_ab_id" varchar,
-- "_airbyte_emitted_at" timestamptz,
-- "_airbyte_normalized_at" timestamptz,
-- "_airbyte_dev_purchases_hashid" text,
);
```
### Features
| Feature | Supported?\(Yes/No\) | Notes |
| :---------------- | :------------------- | :---- |
| Full Refresh Sync | Yes | |
| Incremental Sync | Yes | |
| Namespaces | No | |
Of note, if you choose `Incremental Sync`, state will be maintained between syncs, and once you hit `count` records, no new records will be added.
You can choose a specific `seed` (integer) as an option for this connector which will guarantee that the same fake records are generated each time. Otherwise, random data will be created on each subsequent sync.
### Requirements
None!
## Changelog
| Version | Date | Pull Request | Subject |
|:--------|:-----------|:----------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------|
| 4.0.0 | 2023-07-19 | [28485](https://github.com/airbytehq/airbyte/pull/28485) | Bump to test publication |
| 3.0.2 | 2023-07-07 | [27807](https://github.com/airbytehq/airbyte/pull/28060) | Bump to test publication |
| 3.0.1 | 2023-06-28 | [27807](https://github.com/airbytehq/airbyte/pull/27807) | Fix bug with purchase stream updated_at |
| 3.0.0 | 2023-06-23 | [27684](https://github.com/airbytehq/airbyte/pull/27684) | Stream cursor is now `updated_at` & remove `records_per_sync` option |
| 2.1.0 | 2023-05-08 | [25903](https://github.com/airbytehq/airbyte/pull/25903) | Add user.address (object) |
| 2.0.3 | 2023-02-20 | [23259](https://github.com/airbytehq/airbyte/pull/23259) | bump to test publication |
| 2.0.2 | 2023-02-20 | [23259](https://github.com/airbytehq/airbyte/pull/23259) | bump to test publication |
| 2.0.1 | 2023-01-30 | [22117](https://github.com/airbytehq/airbyte/pull/22117) | `source-faker` goes beta |
| 2.0.0 | 2022-12-14 | [20492](https://github.com/airbytehq/airbyte/pull/20492) and [20741](https://github.com/airbytehq/airbyte/pull/20741) | Decouple stream states for better parallelism |
| 1.0.0 | 2022-11-28 | [19490](https://github.com/airbytehq/airbyte/pull/19490) | Faker uses the CDK; rename streams to be lower-case (breaking), add determinism to random purchases, and rename |
| 0.2.1 | 2022-10-14 | [19197](https://github.com/airbytehq/airbyte/pull/19197) | Emit `AirbyteEstimateTraceMessage` |
| 0.2.0 | 2022-10-14 | [18021](https://github.com/airbytehq/airbyte/pull/18021) | Move to mimesis for speed! |
| 0.1.8 | 2022-10-12 | [17889](https://github.com/airbytehq/airbyte/pull/17889) | Bump to test publish command (2) |
| 0.1.7 | 2022-10-11 | [17848](https://github.com/airbytehq/airbyte/pull/17848) | Bump to test publish command |
| 0.1.6 | 2022-09-07 | [16418](https://github.com/airbytehq/airbyte/pull/16418) | Log start of each stream |
| 0.1.5 | 2022-06-10 | [13695](https://github.com/airbytehq/airbyte/pull/13695) | Emit timestamps in the proper ISO format |
| 0.1.4 | 2022-05-27 | [13298](https://github.com/airbytehq/airbyte/pull/13298) | Test publication flow |
| 0.1.3 | 2022-05-27 | [13248](https://github.com/airbytehq/airbyte/pull/13248) | Add options for records_per_sync and page_size |
| 0.1.2 | 2022-05-26 | [13248](https://github.com/airbytehq/airbyte/pull/13293) | Test publication flow |
| 0.1.1 | 2022-05-26 | [13235](https://github.com/airbytehq/airbyte/pull/13235) | Publish for AMD and ARM (M1 Macs) & remove User.birthdate |
| 0.1.0 | 2022-04-12 | [11738](https://github.com/airbytehq/airbyte/pull/11738) | The Faker Source is created |