API Reference

Complete API documentation for PyMessage functions.

Core Functions

get_messages()

Retrieve iMessage messages from an iPhone backup or a macOS chat database.

Query messages with optional filtering by phone numbers and date range. Returns a pandas DataFrame with message details, attachments, and reactions.

Parameters:

Name	Type	Description	Default
`backup`	`Backup \| str \| Path`	A Backup object from find_backups(), or a direct path (str or Path) to a chat.db file (e.g. ~/Library/Messages/chat.db on macOS). Also accepts EXAMPLE_BACKUP for testing.	required
`phone_numbers`	`str \| list[str] \| None`	Single phone number or email address, or a list of them, used to filter conversations. Accepts various formats: "+1234567890", "(123) 456-7890", "email@example.com"	`None`
`date_range`	`tuple[str \| datetime, str \| datetime] \| None`	Tuple of (start, end) dates for filtering. Each date can be an ISO format string (e.g. "2024-01-01", "2024-12-31") or a datetime object. If None, returns all messages.	`None`
`output_csv`	`str \| Path \| None`	Optional path to export results as CSV.	`None`

Returns:

Type	Description
`DataFrame`	DataFrame with columns:
`DataFrame`	timestamp (pd.Timestamp): Message timestamp in UTC
`DataFrame`	read_at (pd.Timestamp \| None): When message was read (None if unread)
`DataFrame`	sender (str): Phone number or email of sender
`DataFrame`	contact_name (str): Display name from handle table, or "Me" for sent messages
`DataFrame`	message_text (str): Text content of message
`DataFrame`	is_from_me (bool): True if sent by device owner
`DataFrame`	chat_id (str): Chat identifier
`DataFrame`	is_group_chat (bool): True if group conversation
`DataFrame`	attachment_path (str \| None): Path to attachment file
`DataFrame`	reaction_type (str \| None): Type of reaction if this is a tapback
`DataFrame`	reaction_action (str \| None): "add" or "remove" for reactions

Raises:

Type	Description
`ValueError`	If date_range has invalid format.
`FileNotFoundError`	If specified path doesn't exist.

Examples:

>>> from pymessage import find_backups, get_messages
>>> backups = find_backups()
>>> df = get_messages(backups[0])

>>> # Get messages for specific contact
>>> df = get_messages(backups[0], phone_numbers="+1234567890")

>>> # Get messages in date range and export to CSV
>>> df = get_messages(
...     backups[0],
...     date_range=("2024-01-01", "2024-12-31"),
...     output_csv="messages.csv"
... )
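
The date_range handling described above can be pictured roughly as follows. This is a minimal sketch rather than PyMessage's own code: the helper name coerce_date_range is hypothetical, and rejecting a reversed range is an assumption, since the reference only says that an invalid format raises ValueError.

```python
from datetime import datetime


def coerce_date_range(date_range):
    """Hypothetical helper: turn (start, end) into datetime objects."""
    if date_range is None:
        return None  # no filtering: all messages are returned
    if len(date_range) != 2:
        raise ValueError("date_range must be a (start, end) tuple")

    def to_datetime(value):
        if isinstance(value, datetime):
            return value
        # fromisoformat raises ValueError for strings it cannot parse
        return datetime.fromisoformat(value)

    start, end = (to_datetime(v) for v in date_range)
    if start > end:  # assumption: a reversed range is treated as invalid
        raise ValueError("start must not be after end")
    return start, end
```

Strings and datetime objects can be mixed in the same tuple, for example ("2024-01-01", datetime(2024, 12, 31)).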

list_conversations()

List all conversations with summary statistics.

Returns metadata about each conversation including participant count, message count, and date range.

Parameters:

Name	Type	Description	Default
`backup`	`Backup`	A Backup object specifying the data source. Use find_backups() to discover available sources, or EXAMPLE_BACKUP for testing.	required
`include_empty`	`bool`	Include conversations with no messages.	`False`

Returns:

Type	Description
`DataFrame`	DataFrame with columns:
`DataFrame`	chat_id (str): Chat identifier
`DataFrame`	is_group_chat (bool): True if group conversation
`DataFrame`	participants (list[str]): List of phone numbers/emails
`DataFrame`	participant_count (int): Number of participants
`DataFrame`	message_count (int): Total messages in conversation
`DataFrame`	first_message (pd.Timestamp): Earliest message timestamp
`DataFrame`	last_message (pd.Timestamp): Most recent message timestamp
`DataFrame`	display_name (str \| None): Chat display name if available

Raises:

Type	Description
`FileNotFoundError`	If specified path doesn't exist.

Examples:

>>> from pymessage import find_backups, list_conversations
>>> backups = find_backups()
>>> df = list_conversations(backups[0])
>>> # Filter to group chats only
>>> groups = df[df["is_group_chat"]]
>>> # Sort by most active
>>> df.sort_values("message_count", ascending=False)

Backup Management

find_backups()

Scan for all available iMessage data sources.

Searches ~/Library/Application Support/MobileSync/Backup/ for iPhone backups and checks ~/Library/Messages/chat.db for the macOS Messages database.

Returns:

Type	Description
`list[Backup]`	List of Backup objects sorted by last backup date (most recent first), with the macOS entry appended at the end if found.

Examples:

>>> from pymessage import find_backups
>>> backups = find_backups()
>>> for b in backups:
...     print(b)
[iPhone] Tucker's iPhone (iOS 17.2) — Last backup: 2024-03-01
[macOS] MacBook Messages
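
The ordering rule in the Returns table can be sketched as below. The Source dataclass is a hypothetical stand-in for the Backup object, whose real fields are not documented in this reference, and order_sources is an illustrative name.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Source:
    """Hypothetical stand-in for PyMessage's Backup object."""
    kind: str                        # "iPhone" or "macOS"
    name: str
    last_backup: Optional[datetime]  # None for the macOS database


def order_sources(sources):
    """iPhone backups newest first, then the macOS entry (if any)."""
    iphone = [s for s in sources if s.kind == "iPhone"]
    macos = [s for s in sources if s.kind == "macOS"]
    iphone.sort(key=lambda s: s.last_backup, reverse=True)
    return iphone + macos
```

With this ordering, the first element of the list is the most recent iPhone backup whenever at least one exists.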

get_backup_info()

Extract metadata from iPhone backup directory.

Reads Info.plist and Manifest.plist to extract device information and backup details.

Parameters:

Name	Type	Description	Default
`backup_path`	`str \| Path`	Path to backup directory.	required

Returns:

Type	Description
`dict[str, Any]`	Dictionary with backup metadata:
`dict[str, Any]`	path (Path): Absolute path to backup directory
`dict[str, Any]`	device_name (str): Device name from Info.plist
`dict[str, Any]`	last_backup (datetime): Last backup timestamp
`dict[str, Any]`	ios_version (str): iOS version string
`dict[str, Any]`	phone_number (str \| None): Phone number if available
`dict[str, Any]`	serial_number (str): Device serial number

Raises:

Type	Description
`FileNotFoundError`	If backup_path doesn't exist.
`ValueError`	If Info.plist is missing or malformed.

Examples:

>>> info = get_backup_info("/path/to/backup")
>>> print(info["device_name"])
John's iPhone
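
For readers curious how such metadata can be read, here is a minimal sketch using the standard library's plistlib. It is not PyMessage's implementation: read_backup_info is a hypothetical name, the plist key names ("Device Name", "Last Backup Date", and so on) are assumptions based on typical iPhone backups, and Manifest.plist is not consulted here.

```python
import plistlib
from pathlib import Path


def read_backup_info(backup_path):
    """Hypothetical sketch: read device details from a backup's Info.plist."""
    root = Path(backup_path).expanduser().resolve()
    if not root.is_dir():
        raise FileNotFoundError(f"Backup directory not found: {root}")

    info_file = root / "Info.plist"
    if not info_file.is_file():
        raise ValueError(f"Info.plist is missing from {root}")
    try:
        with info_file.open("rb") as fh:
            info = plistlib.load(fh)
    except Exception as exc:  # plistlib raises several error types
        raise ValueError(f"Info.plist is malformed: {exc}") from exc

    # Key names are assumptions based on typical iPhone backups.
    return {
        "path": root,
        "device_name": info.get("Device Name"),
        "last_backup": info.get("Last Backup Date"),
        "ios_version": info.get("Product Version"),
        "phone_number": info.get("Phone Number"),
        "serial_number": info.get("Serial Number"),
    }
```

The two error branches mirror the Raises table above: a missing directory raises FileNotFoundError, while a missing or unreadable Info.plist raises ValueError.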

Attachments

get_attachments()

Retrieve attachment metadata and file paths.

Returns information about all attachments in conversations, optionally filtered by phone numbers.

Parameters:

Name	Type	Description	Default
`backup`	`Backup`	A Backup object specifying the data source. Use find_backups() to discover available sources, or EXAMPLE_BACKUP for testing.	required
`phone_numbers`	`str \| list[str] \| None`	Filter to attachments in these conversations.	`None`

Returns:

Type	Description
`DataFrame`	DataFrame with columns:
`DataFrame`	attachment_id (int): Attachment rowid
`DataFrame`	message_id (int): Associated message rowid
`DataFrame`	filename (str): Original filename
`DataFrame`	mime_type (str): MIME type (e.g., "image/jpeg")
`DataFrame`	file_size (int): Size in bytes
`DataFrame`	file_path (str \| None): Resolved path to attachment file
`DataFrame`	timestamp (pd.Timestamp): Message timestamp
`DataFrame`	sender (str): Sender phone/email

Raises:

Type	Description
`FileNotFoundError`	If specified path doesn't exist.

Examples:

>>> from pymessage import find_backups, get_attachments
>>> backups = find_backups()
>>> df = get_attachments(backups[0])
>>> # Filter to images only
>>> images = df[df["mime_type"].str.startswith("image/")]

resolve_attachment_path()

Resolve attachment filename to actual path in backup.

iPhone backups store each file under the SHA1 hash of its domain and relative path:

path = SHA1("MediaDomain-" + relative_path)

Structure: backup_root/[first_2_hex]/[full_hash]

Parameters:

Name	Type	Description	Default
`filename`	`str`	Relative filename from attachment table.	required
`backup_root`	`Path`	Root directory of backup.	required

Returns:

Type	Description
`Path \| None`	Absolute path to attachment file, or None if not found.

Examples:

>>> from pathlib import Path
>>> path = resolve_attachment_path(
...     "Library/SMS/Attachments/ab/12/IMG_1234.jpg",
...     Path("/path/to/backup")
... )
>>> print(path)
/path/to/backup/41/41746ffc65924078eae42725c979305626f57cca
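
The hashing scheme described for this function can be reproduced with hashlib. The sketch below follows the stated formula directly; hashed_backup_path is a hypothetical name, and any preprocessing that PyMessage may apply to the filename before hashing is not shown.

```python
import hashlib
from pathlib import Path


def hashed_backup_path(filename, backup_root, domain="MediaDomain"):
    """Sketch: locate a file inside a backup from its relative path."""
    digest = hashlib.sha1(f"{domain}-{filename}".encode("utf-8")).hexdigest()
    candidate = Path(backup_root) / digest[:2] / digest
    # Mirror the documented contract: None when the file is absent.
    return candidate if candidate.exists() else None
```

The first two hex characters of the digest name the subdirectory, which keeps any single directory in the backup from holding every file.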

Utility Functions

convert_apple_timestamp()

Convert Apple timestamp to pandas Timestamp.

Apple uses two timestamp formats:

- Values >= 1 trillion: nanoseconds since 2001-01-01
- Values < 1 trillion: seconds since 2001-01-01

Zero values are treated as None (no timestamp).

Parameters:

Name	Type	Description	Default
`timestamp`	`int \| float \| None`	Apple timestamp value, or None.	required

Returns:

Type	Description
`Timestamp \| None`	pandas Timestamp object in UTC, or None if input is None/zero.

Examples:

>>> convert_apple_timestamp(None)

>>> convert_apple_timestamp(0)

>>> # Seconds format (older iOS)
>>> ts = convert_apple_timestamp(629990400)
>>> ts.year
2020
>>> # Nanoseconds format (modern iOS)
>>> ts = convert_apple_timestamp(629990400000000000)
>>> ts.year
2020
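
The two-format rule can be sketched with the standard library alone. This illustration returns a timezone-aware datetime rather than the pandas Timestamp that the documented function returns, and apple_to_datetime is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)
NANOSECOND_THRESHOLD = 1_000_000_000_000  # 1 trillion, per the rule above


def apple_to_datetime(timestamp):
    """Sketch: convert an Apple timestamp to an aware UTC datetime."""
    if not timestamp:  # None and 0 both mean "no timestamp"
        return None
    if timestamp >= NANOSECOND_THRESHOLD:
        seconds = timestamp / 1_000_000_000  # nanoseconds -> seconds
    else:
        seconds = timestamp
    return APPLE_EPOCH + timedelta(seconds=seconds)
```

Both example values from above, 629990400 and 629990400000000000, resolve to the same instant under this rule.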

parse_reaction_type()

Parse reaction/tapback type from associated_message_type.

Reactions are encoded as separate messages with specific type codes:

- 2000-2007: Tapback added
- 3000-3007: Tapback removed

Parameters:

Name	Type	Description	Default
`associated_message_type`	`int \| None`	Type code from message.associated_message_type.	required

Returns:

Type	Description
`tuple[str \| None, str \| None]`	Tuple of (reaction_type, action) where:
`tuple[str \| None, str \| None]`	reaction_type: "loved", "liked", "disliked", "laughed", "emphasized", "questioned", or None
`tuple[str \| None, str \| None]`	action: "add" or "remove" or None

Examples:

>>> parse_reaction_type(2000)
('loved', 'add')
>>> parse_reaction_type(3001)
('liked', 'remove')
>>> parse_reaction_type(2003)
('laughed', 'add')
>>> parse_reaction_type(None)
(None, None)
>>> parse_reaction_type(0)
(None, None)
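
A rough sketch of the decoding follows. The examples above confirm the codes for "loved", "liked", and "laughed"; the positions of "disliked", "emphasized", and "questioned" are assumed to follow the order in which the Returns table lists them. Codes 2006-2007 and 3006-3007 have no names in this reference, so the sketch returns (None, None) for them, which may differ from the library's behavior. The name decode_reaction is hypothetical.

```python
REACTION_NAMES = ["loved", "liked", "disliked", "laughed", "emphasized", "questioned"]


def decode_reaction(associated_message_type):
    """Sketch: map a tapback type code to (reaction_type, action)."""
    if associated_message_type is None:
        return (None, None)
    for base, action in ((2000, "add"), (3000, "remove")):
        offset = associated_message_type - base
        if 0 <= offset < len(REACTION_NAMES):
            return (REACTION_NAMES[offset], action)
    # Codes outside the named range (including 2006-2007 and 3006-3007)
    # are not named in this reference, so the sketch returns no reaction.
    return (None, None)
```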

normalize_phone_number()

Normalize phone number to digits, keeping a leading '+' if present.

Strips all non-digit characters except leading '+'. Email addresses (containing '@') are returned as-is.

Parameters:

Name	Type	Description	Default
`phone`	`str`	Phone number in any format, or email address.	required

Returns:

Type	Description
`str`	Normalized phone number or email address.

Examples:

>>> normalize_phone_number("+1 (234) 567-8900")
'+12345678900'
>>> normalize_phone_number("(234) 567-8900")
'2345678900'
>>> normalize_phone_number("234-567-8900")
'2345678900'
>>> normalize_phone_number("user@example.com")
'user@example.com'
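
The rule stated above is small enough to sketch in a few lines. This is an illustration that reproduces the documented examples, not the library's source, and the name normalize is hypothetical.

```python
import re


def normalize(phone):
    """Sketch: keep digits and a leading '+'; pass emails through."""
    if "@" in phone:
        return phone  # email addresses are returned unchanged
    stripped = phone.strip()
    prefix = "+" if stripped.startswith("+") else ""
    return prefix + re.sub(r"\D", "", stripped)
```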

generate_phone_variants()

Generate lookup variants for phone number matching.

Creates multiple representations to match against the database, handling variations in how iMessage stores contact identifiers. Special handling for US +1 country code.

Parameters:

Name	Type	Description	Default
`phone`	`str`	Normalized phone number or email address.	required

Returns:

Type	Description
`list[str]`	List of variants to try for lookup. Email addresses return a single-item list.

Examples:

>>> variants = generate_phone_variants("+12345678900")
>>> set(variants) == {"+12345678900", "12345678900", "2345678900"}
True
>>> variants = generate_phone_variants("2345678900")
>>> "+12345678900" in variants
True
>>> "12345678900" in variants
True
>>> generate_phone_variants("user@example.com")
['user@example.com']
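
One way to produce variants consistent with the examples above is sketched here. The handling of numbers that are neither 10 nor 11 digits is an assumption, since the reference does not describe it, and the real function may return additional variants or a different order. The name phone_variants is hypothetical.

```python
def phone_variants(phone):
    """Sketch: lookup variants for a normalized phone number or email."""
    if "@" in phone:
        return [phone]
    digits = phone.lstrip("+")
    if len(digits) == 11 and digits.startswith("1"):
        national = digits[1:]  # drop the US country code
    elif len(digits) == 10:
        national = digits      # assume a US number without country code
    else:
        # Assumption: not recognizably a US number, so only try it
        # with and without the '+', dropping duplicates.
        return list(dict.fromkeys([phone, digits]))
    return ["+1" + national, "1" + national, national]
```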

Analytics

get_activity_summary()

Compute overall messaging activity statistics across all messages.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	DataFrame produced by get_messages().	required
`start`	`str \| Timestamp \| None`	Optional start date for filtering (ISO string or Timestamp).	`None`
`end`	`str \| Timestamp \| None`	Optional end date for filtering (ISO string or Timestamp).	`None`
`last_n_days`	`int \| None`	If provided, overrides start/end and filters to the last N days relative to reference_date.	`None`
`reference_date`	`Timestamp \| None`	Reference point for last_n_days. Defaults to now (UTC).	`None`
`top_n`	`int`	Number of top contacts to include in top_contacts_df.	`10`

Returns:

Type	Description
`tuple[DataFrame, DataFrame]`	Tuple of (summary_df, top_contacts_df).
`tuple[DataFrame, DataFrame]`	summary_df is a single-row DataFrame with columns:
`tuple[DataFrame, DataFrame]`	total_messages (int)
`tuple[DataFrame, DataFrame]`	total_sent (int)
`tuple[DataFrame, DataFrame]`	total_received (int)
`tuple[DataFrame, DataFrame]`	avg_messages_per_day (float)
`tuple[DataFrame, DataFrame]`	unique_contacts (int)
`tuple[DataFrame, DataFrame]`	most_active_day_of_week (str, e.g. "Saturday")
`tuple[DataFrame, DataFrame]`	most_active_hour (int, 0–23)
`tuple[DataFrame, DataFrame]`	late_night_contacts (list[str])
`tuple[DataFrame, DataFrame]`	pct_messages_with_attachments (float, 0–1)
`tuple[DataFrame, DataFrame]`	avg_message_length (float)
`tuple[DataFrame, DataFrame]`	avg_response_time_seconds (float)
`tuple[DataFrame, DataFrame]`	conversations_initiated (int)
`tuple[DataFrame, DataFrame]`	conversations_received (int)
`tuple[DataFrame, DataFrame]`	ghost_contacts (list[str])
`tuple[DataFrame, DataFrame]`	top_contacts_df has columns: contact, total, sent, received.
`tuple[DataFrame, DataFrame]`	Sorted descending by total, limited to top_n rows.

Examples:

>>> from pymessage import EXAMPLE_BACKUP, get_messages, get_activity_summary
>>> df = get_messages(EXAMPLE_BACKUP)
>>> summary, top = get_activity_summary(df)
>>> print(summary["total_messages"].iloc[0])
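
The interaction between start, end, last_n_days, and reference_date can be sketched as follows. This only illustrates the documented precedence (last_n_days overrides start/end). The helper name resolve_window is hypothetical, and using reference_date itself as the upper bound is an assumption.

```python
from datetime import datetime, timedelta, timezone


def resolve_window(start=None, end=None, last_n_days=None, reference_date=None):
    """Sketch: work out the (start, end) bounds used for filtering."""
    if last_n_days is not None:
        # last_n_days takes precedence over any explicit start/end.
        ref = reference_date or datetime.now(timezone.utc)
        return ref - timedelta(days=last_n_days), ref
    return start, end
```

Passing an explicit reference_date makes a last_n_days query reproducible, because the window no longer depends on the current time.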

get_contact_summary()

Compute per-contact messaging statistics.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	DataFrame produced by get_messages().	required
`contact`	`str`	Phone number or email to summarize. All format variants are checked (e.g. "+12345678900", "2345678900").	required
`start`	`str \| Timestamp \| None`	Optional start date for filtering.	`None`
`end`	`str \| Timestamp \| None`	Optional end date for filtering.	`None`
`last_n_days`	`int \| None`	If provided, overrides start/end.	`None`
`reference_date`	`Timestamp \| None`	Reference point for last_n_days. Defaults to now (UTC).	`None`

Returns:

Type	Description
`DataFrame`	Single-row DataFrame with columns:
`DataFrame`	total_messages (int)
`DataFrame`	total_sent (int)
`DataFrame`	total_received (int)
`DataFrame`	send_receive_ratio (float): total_sent / total_received. Returns float("inf") if no messages have been received from this contact.
`DataFrame`	avg_messages_per_active_day (float)
`DataFrame`	total_active_days (int)
`DataFrame`	avg_read_time_seconds (float)
`DataFrame`	avg_response_time_you_seconds (float)
`DataFrame`	avg_response_time_contact_seconds (float)
`DataFrame`	conversations_initiated_you (int)
`DataFrame`	conversations_initiated_contact (int)
`DataFrame`	longest_gap_days (float)
`DataFrame`	messages_with_attachments (int)
`DataFrame`	avg_message_length_you (float)
`DataFrame`	avg_message_length_contact (float)
`DataFrame`	short_message_count_you (int)
`DataFrame`	short_message_count_contact (int)
`DataFrame`	most_active_hour (int, 0–23)
`DataFrame`	most_active_day_of_week (str)

Examples:

>>> from pymessage import EXAMPLE_BACKUP, get_messages, get_contact_summary
>>> df = get_messages(EXAMPLE_BACKUP)
>>> s = get_contact_summary(df, "+18015550002")
>>> print(s["total_messages"].iloc[0])
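
The send_receive_ratio column has one edge case worth spelling out. The sketch below restates the documented rule in plain Python; the standalone function is hypothetical and exists only for illustration.

```python
def send_receive_ratio(total_sent, total_received):
    """Sketch: sent/received ratio, infinite when nothing was received."""
    if total_received == 0:
        return float("inf")
    return total_sent / total_received
```

A value above 1.0 means more messages were sent to the contact than received from them.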

get_contact_heatmap()

Build a 7×24 message-count heatmap for a contact.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	DataFrame produced by get_messages().	required
`contact`	`str`	Phone number or email to filter on.	required
`start`	`str \| Timestamp \| None`	Optional start date for filtering.	`None`
`end`	`str \| Timestamp \| None`	Optional end date for filtering.	`None`
`last_n_days`	`int \| None`	If provided, overrides start/end.	`None`
`reference_date`	`Timestamp \| None`	Reference point for last_n_days. Defaults to now (UTC).	`None`

Returns:

Type	Description
`DataFrame`	7×24 DataFrame where:
`DataFrame`	Index: day-of-week strings Monday through Sunday
`DataFrame`	Columns: integers 0–23 (hours)
`DataFrame`	Values: message counts (int)

Examples:

>>> from pymessage import EXAMPLE_BACKUP, get_messages, get_contact_heatmap
>>> df = get_messages(EXAMPLE_BACKUP)
>>> heatmap = get_contact_heatmap(df, "+18015550003")
>>> print(heatmap.shape)  # (7, 24)
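
The bucketing behind a 7×24 grid can be illustrated without pandas. This sketch counts plain datetime objects into a dict of lists instead of building a DataFrame, so it shows the idea rather than the library's implementation; build_heatmap is a hypothetical name.

```python
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def build_heatmap(timestamps):
    """Sketch: count messages per (day of week, hour) bucket."""
    grid = {day: [0] * 24 for day in DAYS}
    for ts in timestamps:
        grid[DAYS[ts.weekday()]][ts.hour] += 1  # weekday(): Monday is 0
    return grid
```

Every bucket starts at zero, so hours with no messages still appear in the grid, which matches the fixed 7×24 shape of the documented return value.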