Files
FennFlow wraps raw bytes in typed content models instead of passing bare `bytes` and `dict` objects around. Every file you put into
or get from storage is represented as one of these models.
All binary content types inherit from BinaryContent, which itself inherits from BaseContent (a Pydantic model).
Content types
| Class | Media type | Notes |
|---|---|---|
| `BinaryContent` | any | Base class. Use when no specific type fits |
| `TextContent` | `text/plain` | Stores text as UTF-8 bytes internally. `.content` returns `str` |
| `JsonContent` | `application/json` | Stores JSON as UTF-8 bytes. `.content` returns the parsed Python object |
| `ImageContent` | `image/*` | Extends `BinaryContent` with optional `width` and `height` fields |
| `AudioContent` | `audio/*` | Extends `BinaryContent` with an optional `duration` field |
| `VideoContent` | `video/*` | Extends `BinaryContent` with optional `duration`, `width`, and `height` fields |
| `DocumentContent` | any document type | Thin subclass of `BinaryContent` |
There is also a model for files that are referenced by URL rather than stored as bytes:
| Class | Media type | Notes |
|---|---|---|
| `UrlContent` | any | Stores a URL string instead of bytes. `data` is `str` |
Creating content
TextContent and JsonContent expose a from_content() classmethod as the primary constructor:
```python
from fennflow.files import TextContent, JsonContent, BinaryContent, ImageContent, ContentFactory

# Placeholder inputs for illustration only
img_bytes = b"..."
file_bytes = b"Hello, world!"
metadata = {"author": "alice"}

text = TextContent.from_content("Hello, world!")
json_file = JsonContent.from_content({"key": "value"})
json_list = JsonContent.from_content([1, 2, 3])

# BinaryContent requires an explicit media_type
binary = BinaryContent(data=b"...", media_type="application/octet-stream")

# Optional metadata fields
image = ImageContent(data=img_bytes, media_type="image/png", width=800, height=600)

# ContentFactory returns the content class that matches the media type
file: TextContent = ContentFactory.from_bytes(
    media_type="text/plain",
    data=file_bytes,
    **metadata,
)
```
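To make the bytes-in, typed-value-out behaviour concrete, here is a minimal standalone sketch of how a `from_content()` constructor and a `.content` accessor could fit together. It does not use FennFlow. `SketchTextContent` and `SketchJsonContent` are hypothetical stand-ins built on plain dataclasses, and the real classes (which are Pydantic models) may differ in detail.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class SketchTextContent:
    """Hypothetical stand-in for TextContent: UTF-8 bytes inside, str outside."""

    data: bytes
    media_type: str = "text/plain"

    @classmethod
    def from_content(cls, text: str) -> "SketchTextContent":
        # Encode once at construction so storage only ever sees bytes.
        return cls(data=text.encode("utf-8"))

    @property
    def content(self) -> str:
        return self.data.decode("utf-8")


@dataclass
class SketchJsonContent:
    """Hypothetical stand-in for JsonContent: JSON bytes inside, parsed object outside."""

    data: bytes
    media_type: str = "application/json"

    @classmethod
    def from_content(cls, obj: Any) -> "SketchJsonContent":
        # Serialize to JSON text, then store as UTF-8 bytes.
        return cls(data=json.dumps(obj).encode("utf-8"))

    @property
    def content(self) -> Any:
        return json.loads(self.data.decode("utf-8"))


text = SketchTextContent.from_content("Hello, world!")
json_file = SketchJsonContent.from_content({"key": "value"})
```

In this sketch `text.data` holds the encoded bytes while `text.content` gives back the original string, and `json_file.content` gives back the original dictionary.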
Filename and media type resolution
`filename` and `media_type` are each optional at construction time, but at least one of them must be given. FennFlow resolves the missing value as follows:
- If only `media_type` is given: filename is generated as the SHA-256 hash of the data, with the extension guessed from the media type.
- If only `filename` is given: media type is guessed from the extension via `mimetypes`.
- If both are omitted: raises `FileNameAndMediaTypeBothNoneException`.
- If the filename has no extension: the extension is guessed from the media type and appended.
A warning is logged if the file extension and media type do not agree.
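The resolution rules above can be approximated with the standard library alone. The following is a rough sketch, not FennFlow's implementation: `resolve_name_and_type` is a hypothetical helper, the exception class is a local stand-in for the FennFlow one, and the handling of unknown extensions (returning `None` as the media type) is an assumption the source does not specify.

```python
import hashlib
import logging
import mimetypes
from pathlib import PurePosixPath

logger = logging.getLogger(__name__)


class FileNameAndMediaTypeBothNoneException(ValueError):
    """Local stand-in for the FennFlow exception of the same name."""


def resolve_name_and_type(data, filename=None, media_type=None):
    """Return (filename, media_type) following the rules listed above."""
    if filename is None and media_type is None:
        raise FileNameAndMediaTypeBothNoneException(
            "either filename or media_type must be provided"
        )

    if filename is None:
        # Only media_type given: name is the SHA-256 hex digest plus a guessed extension.
        ext = mimetypes.guess_extension(media_type) or ""
        return hashlib.sha256(data).hexdigest() + ext, media_type

    guessed_type, _ = mimetypes.guess_type(filename)

    if media_type is None:
        # Only filename given: guess from the extension (None if unknown; assumption).
        return filename, guessed_type

    if not PurePosixPath(filename).suffix:
        # Filename without an extension: append one guessed from the media type.
        filename += mimetypes.guess_extension(media_type) or ""
    elif guessed_type is not None and guessed_type != media_type:
        # Extension and media type disagree: warn, but keep what the caller passed.
        logger.warning(
            "extension of %s suggests %s, but media type is %s",
            filename, guessed_type, media_type,
        )

    return filename, media_type


name, mtype = resolve_name_and_type(b"abc", media_type="image/png")
```

Here `name` is the 64-character hex digest of the data followed by the extension that `mimetypes` associates with `image/png`.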
Extra metadata
Any keyword arguments not matching declared fields are collected into extra_metadata: dict[str, str]. This metadata is
forwarded to the connector (e.g. stored as S3 object metadata).
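One way to picture this collection step is a constructor that splits keyword arguments into declared fields and everything else. The sketch below is an illustration under assumptions, not FennFlow code: `SketchContent` and its `create()` helper are hypothetical, and coercing values to `str` is assumed only because the field is typed `dict[str, str]`.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SketchContent:
    """Hypothetical content model with two declared fields plus extra_metadata."""

    data: bytes
    media_type: str
    extra_metadata: dict = field(default_factory=dict)

    @classmethod
    def create(cls, **kwargs):
        # Names of the declared fields, excluding the catch-all itself.
        declared = {f.name for f in fields(cls)} - {"extra_metadata"}
        known = {k: v for k, v in kwargs.items() if k in declared}
        # Everything else becomes string metadata (str coercion is an assumption).
        extra = {k: str(v) for k, v in kwargs.items() if k not in declared}
        return cls(**known, extra_metadata=extra)


item = SketchContent.create(
    data=b"%PDF-", media_type="application/pdf", author="alice", page=3
)
```

After this call `item.extra_metadata` contains `author` and `page` as strings, which is the shape a connector would need in order to store them as, for example, S3 object metadata.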
ContentFactory
When FennFlow retrieves a file from storage, it reconstructs the appropriate content type using
`ContentFactory.from_bytes()`. The factory resolves the class from a registry by matching the MIME type, falling back to
`BinaryContent` for unknown types.
You can register custom content types in content_registry to have them returned automatically on get.
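The lookup-with-fallback idea can be sketched as a dictionary keyed by MIME pattern. This is a standalone illustration, not the FennFlow API: `sketch_registry`, `from_bytes`, and the `Sketch*` classes are hypothetical, and the actual shape of `content_registry` and its matching rules (exact match first, then wildcard patterns such as `image/*`) are assumptions.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class SketchBinary:
    """Hypothetical fallback type, standing in for BinaryContent."""

    data: bytes
    media_type: str


class SketchText(SketchBinary):
    pass


class SketchImage(SketchBinary):
    pass


# Hypothetical registry: MIME type or wildcard pattern -> content class.
sketch_registry = {
    "text/plain": SketchText,
    "image/*": SketchImage,
}


def from_bytes(data, media_type):
    """Pick a class for media_type, falling back to SketchBinary."""
    cls = sketch_registry.get(media_type)  # exact match first
    if cls is None:
        for pattern, candidate in sketch_registry.items():
            if fnmatchcase(media_type, pattern):  # then wildcard patterns
                cls = candidate
                break
    return (cls or SketchBinary)(data=data, media_type=media_type)


# Registering a custom type makes the lookup return it from then on.
class SketchCsv(SketchBinary):
    pass


sketch_registry["text/csv"] = SketchCsv
```

With this registry, `from_bytes(b"a,b", "text/csv")` yields a `SketchCsv`, any `image/...` type yields a `SketchImage`, and an unregistered type such as `application/zip` falls back to `SketchBinary`.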