API Reference

Complete API documentation for PyMessage functions.

Core Functions

Retrieve iMessage messages from iPhone backup database.

Retrieve iMessage messages from a chat database.

Query messages with optional filtering by phone numbers and date range. Returns a pandas DataFrame with message details, attachments, and reactions.

Parameters:

Name Type Description Default
backup

A Backup object from find_backups(), or a direct path (str or Path) to a chat.db file (e.g. ~/Library/Messages/chat.db on macOS). Also accepts EXAMPLE_BACKUP for testing.

required
phone_numbers str | list[str] | None

Single phone number or list to filter conversations. Accepts various formats: "+1234567890", "(123) 456-7890", "email@example.com"

None
date_range tuple[str | datetime, str | datetime] | None

Tuple of (start, end) dates for filtering. Each date can be an ISO format string (e.g. "2024-01-01", "2024-12-31") or a datetime object. If None, returns all messages.

None
output_csv str | Path | None

Optional path to export results as CSV.

None

Returns:

Type Description
DataFrame

DataFrame with columns:

  • timestamp (pd.Timestamp): Message timestamp in UTC
  • read_at (pd.Timestamp | None): When message was read (None if unread)
  • sender (str): Phone number or email of sender
  • contact_name (str): Display name from handle table, or "Me" for sent messages
  • message_text (str): Text content of message
  • is_from_me (bool): True if sent by device owner
  • chat_id (str): Chat identifier
  • is_group_chat (bool): True if group conversation
  • attachment_path (str | None): Path to attachment file
  • reaction_type (str | None): Type of reaction if this is a tapback
  • reaction_action (str | None): "add" or "remove" for reactions

Raises:

Type Description
ValueError

If date_range has invalid format.

FileNotFoundError

If specified path doesn't exist.

Examples:

>>> from pymessage import find_backups, get_messages
>>> backups = find_backups()
>>> df = get_messages(backups[0])
>>> # Get messages for specific contact
>>> df = get_messages(backups[0], phone_numbers="+1234567890")
>>> # Get messages in date range and export to CSV
>>> df = get_messages(
...     backups[0],
...     date_range=("2024-01-01", "2024-12-31"),
...     output_csv="messages.csv"
... )

List all conversations with summary statistics.

Returns metadata about each conversation including participant count, message count, and date range.

Parameters:

Name Type Description Default
backup

A Backup object specifying the data source. Use find_backups() to discover available sources, or EXAMPLE_BACKUP for testing.

required
include_empty bool

Include conversations with no messages (default False).

False

Returns:

Type Description
DataFrame

DataFrame with columns:

  • chat_id (str): Chat identifier
  • is_group_chat (bool): True if group conversation
  • participants (list[str]): List of phone numbers/emails
  • participant_count (int): Number of participants
  • message_count (int): Total messages in conversation
  • first_message (pd.Timestamp): Earliest message timestamp
  • last_message (pd.Timestamp): Most recent message timestamp
  • display_name (str | None): Chat display name if available

Raises:

Type Description
FileNotFoundError

If specified path doesn't exist.

Examples:

>>> from pymessage import find_backups, list_conversations
>>> backups = find_backups()
>>> df = list_conversations(backups[0])
>>> # Filter to group chats only (the column is already boolean)
>>> groups = df[df["is_group_chat"]]
>>> # Sort by most active
>>> by_activity = df.sort_values("message_count", ascending=False)

Backup Management

Scan default macOS location for iPhone backups.

Scan for all available iMessage data sources.

Searches ~/Library/Application Support/MobileSync/Backup/ for iPhone backups and checks ~/Library/Messages/chat.db for the macOS Messages database.

Returns:

Type Description
list[Backup]

List of Backup objects sorted by last backup date (most recent first), with the macOS entry appended at the end if found.

Examples:

>>> backups = find_backups()
>>> for b in backups:
...     print(b)
[iPhone] Tucker's iPhone (iOS 17.2) — Last backup: 2024-03-01
[macOS] MacBook Messages

Extract metadata from iPhone backup directory.

Reads Info.plist and Manifest.plist to extract device information and backup details.

Parameters:

Name Type Description Default
backup_path str | Path

Path to backup directory.

required

Returns:

Type Description
dict[str, Any]

Dictionary with backup metadata:

  • path (Path): Absolute path to backup directory
  • device_name (str): Device name from Info.plist
  • last_backup (datetime): Last backup timestamp
  • ios_version (str): iOS version string
  • phone_number (str | None): Phone number if available
  • serial_number (str): Device serial number

Raises:

Type Description
FileNotFoundError

If backup_path doesn't exist.

ValueError

If Info.plist is missing or malformed.

Examples:

>>> info = get_backup_info("/path/to/backup")
>>> print(info["device_name"])
John's iPhone

Attachments

Retrieve attachment metadata and file paths.

Returns information about all attachments in conversations, optionally filtered by phone numbers.

Parameters:

Name Type Description Default
backup

A Backup object specifying the data source. Use find_backups() to discover available sources, or EXAMPLE_BACKUP for testing.

required
phone_numbers str | list[str] | None

Filter to attachments in these conversations.

None

Returns:

Type Description
DataFrame

DataFrame with columns:

  • attachment_id (int): Attachment rowid
  • message_id (int): Associated message rowid
  • filename (str): Original filename
  • mime_type (str): MIME type (e.g., "image/jpeg")
  • file_size (int): Size in bytes
  • file_path (str | None): Resolved path to attachment file
  • timestamp (pd.Timestamp): Message timestamp
  • sender (str): Sender phone/email

Raises:

Type Description
FileNotFoundError

If specified path doesn't exist.

Examples:

>>> from pymessage import find_backups, get_attachments
>>> backups = find_backups()
>>> df = get_attachments(backups[0])
>>> # Filter to images only
>>> images = df[df["mime_type"].str.startswith("image/")]

Resolve attachment filename to actual path in backup.

iPhone backups store each file under the SHA1 hash of its domain and relative path:

  path = SHA1("MediaDomain-" + relative_path)

The file is located at backup_root/[first_2_hex]/[full_hash].
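
The hashing scheme above can be sketched with the standard library. This is a minimal illustration, not the library's implementation: the helper name hashed_backup_path is hypothetical, it assumes the path is UTF-8 encoded before hashing, and it only builds the candidate path without checking that the file exists.

```python
import hashlib
from pathlib import Path


def hashed_backup_path(relative_path: str, backup_root: Path) -> Path:
    """Build the location a MediaDomain file would have inside a backup.

    Hypothetical helper for illustration; it does not touch the filesystem.
    """
    # Hash the domain prefix joined to the relative path (UTF-8 is assumed).
    key = f"MediaDomain-{relative_path}".encode("utf-8")
    digest = hashlib.sha1(key).hexdigest()
    # The first two hex characters of the digest name the subdirectory.
    return backup_root / digest[:2] / digest


candidate = hashed_backup_path(
    "Library/SMS/Attachments/ab/12/IMG_1234.jpg",
    Path("/path/to/backup"),
)
```

A real resolver would presumably follow this with an existence check and return None when the file is missing, matching the documented return type.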

Parameters:

Name Type Description Default
filename str

Relative filename from attachment table.

required
backup_root Path

Root directory of backup.

required

Returns:

Type Description
Path | None

Absolute path to attachment file, or None if not found.

Examples:

>>> from pathlib import Path
>>> path = resolve_attachment_path(
...     "Library/SMS/Attachments/ab/12/IMG_1234.jpg",
...     Path("/path/to/backup")
... )
>>> print(path)
/path/to/backup/41/41746ffc65924078eae42725c979305626f57cca

Utility Functions

Convert Apple timestamp to pandas Timestamp.

Apple uses two timestamp formats:

  • Values >= 1 trillion: nanoseconds since 2001-01-01
  • Values < 1 trillion: seconds since 2001-01-01

Zero values are treated as None (no timestamp).
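
The two-format rule above can be sketched as follows. This is a dependency-free approximation, not the library's code: the name apple_to_datetime is hypothetical, and it returns a standard-library datetime rather than a pandas Timestamp, so nanosecond values are truncated to microseconds.

```python
from datetime import datetime, timedelta, timezone

# Apple timestamps count from 2001-01-01 UTC.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)
NANOSECOND_THRESHOLD = 1_000_000_000_000  # 1 trillion


def apple_to_datetime(value):
    """Convert an Apple timestamp to an aware UTC datetime (hypothetical helper)."""
    # None and zero both mean "no timestamp".
    if value is None or value == 0:
        return None
    if value >= NANOSECOND_THRESHOLD:
        # Nanoseconds since the epoch; datetime resolves only microseconds.
        return APPLE_EPOCH + timedelta(microseconds=int(value) // 1000)
    # Otherwise the value is seconds since the epoch.
    return APPLE_EPOCH + timedelta(seconds=value)
```

Both 629990400 and 629990400000000000 map to the same instant under this rule, which is why the two documented examples report the same year.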

Parameters:

Name Type Description Default
timestamp int | float | None

Apple timestamp value, or None.

required

Returns:

Type Description
Timestamp | None

pandas Timestamp object in UTC, or None if input is None/zero.

Examples:

>>> convert_apple_timestamp(None)
>>> convert_apple_timestamp(0)
>>> # Seconds format (older iOS)
>>> ts = convert_apple_timestamp(629990400)
>>> ts.year
2020
>>> # Nanoseconds format (modern iOS)
>>> ts = convert_apple_timestamp(629990400000000000)
>>> ts.year
2020

Parse reaction/tapback type from associated_message_type.

Reactions are encoded as separate messages with specific type codes:

  • 2000-2007: Tapback added
  • 3000-3007: Tapback removed
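
A rough sketch of how these codes might be decoded is shown below. The function name parse_reaction is hypothetical. The mappings for offsets 0, 1, and 3 ("loved", "liked", "laughed") follow the examples in this section; offsets 2, 4, and 5 are assumed from the order in which the reaction names are listed, and offsets 6 and 7 are left unnamed because this page does not name them.

```python
# Offset within the 2000/3000 block -> reaction name.
# Offsets 2, 4 and 5 are an assumption based on the documented name order.
_REACTION_NAMES = {
    0: "loved",
    1: "liked",
    2: "disliked",
    3: "laughed",
    4: "emphasized",
    5: "questioned",
}


def parse_reaction(code):
    """Return (reaction_type, action) for a type code (hypothetical helper)."""
    if code is None:
        return (None, None)
    if 2000 <= code <= 2007:
        # Tapback added; unnamed offsets yield a reaction_type of None.
        return (_REACTION_NAMES.get(code - 2000), "add")
    if 3000 <= code <= 3007:
        # Tapback removed.
        return (_REACTION_NAMES.get(code - 3000), "remove")
    # Anything else (including 0) is an ordinary message.
    return (None, None)
```

Returning (None, "add") for the unnamed codes is a design choice of this sketch; the actual function may behave differently for them.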

Parameters:

Name Type Description Default
associated_message_type int | None

Type code from message.associated_message_type.

required

Returns:

Type Description
tuple[str | None, str | None]

Tuple of (reaction_type, action) where:

  • reaction_type: "loved", "liked", "disliked", "laughed", "emphasized", "questioned", or None
  • action: "add", "remove", or None

Examples:

>>> parse_reaction_type(2000)
('loved', 'add')
>>> parse_reaction_type(3001)
('liked', 'remove')
>>> parse_reaction_type(2003)
('laughed', 'add')
>>> parse_reaction_type(None)
(None, None)
>>> parse_reaction_type(0)
(None, None)

Normalize phone number to digits-only format.

Strips all non-digit characters except leading '+'. Email addresses (containing '@') are returned as-is.
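
The rule above is small enough to sketch in a few lines. The name normalize is hypothetical, and trimming surrounding whitespace before checking for the leading '+' is an assumption of this sketch.

```python
import re


def normalize(phone: str) -> str:
    """Strip formatting from a phone number; pass emails through unchanged."""
    # Email addresses are returned exactly as given.
    if "@" in phone:
        return phone
    stripped = phone.strip()
    # Drop every non-digit character.
    digits = re.sub(r"\D", "", stripped)
    # Restore the leading '+' only if the input started with one.
    return f"+{digits}" if stripped.startswith("+") else digits
```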

Parameters:

Name Type Description Default
phone str

Phone number in any format, or email address.

required

Returns:

Type Description
str

Normalized phone number or email address.

Examples:

>>> normalize_phone_number("+1 (234) 567-8900")
'+12345678900'
>>> normalize_phone_number("(234) 567-8900")
'2345678900'
>>> normalize_phone_number("234-567-8900")
'2345678900'
>>> normalize_phone_number("user@example.com")
'user@example.com'

Generate lookup variants for phone number matching.

Creates multiple representations to match against database, handling variations in how iMessage stores contact identifiers. Special handling for US +1 country code.
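
One way the variant generation could work is sketched below. The name phone_variants is hypothetical, and the exact set of variants is an assumption chosen to agree with the examples in this section: an 11-digit number starting with 1 is treated as a US number with country code, and a 10-digit number is treated as a US number without it.

```python
def phone_variants(phone: str) -> list[str]:
    """List plausible stored forms of a normalized identifier (illustrative)."""
    # Email addresses have a single form.
    if "@" in phone:
        return [phone]
    digits = phone.lstrip("+")
    if len(digits) == 11 and digits.startswith("1"):
        # US number with country code: also try it without the prefix.
        candidates = [phone, f"+{digits}", digits, digits[1:]]
    elif len(digits) == 10:
        # US number without country code: also try it with the prefix.
        candidates = [phone, f"+1{digits}", f"1{digits}", digits]
    else:
        # Other lengths: only toggle the leading '+'.
        candidates = [phone, f"+{digits}", digits]
    # Remove duplicates while keeping the original form first.
    return list(dict.fromkeys(candidates))
```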

Parameters:

Name Type Description Default
phone str

Normalized phone number or email address.

required

Returns:

Type Description
list[str]

List of variants to try for lookup. Email addresses return a single-item list.

Examples:

>>> variants = generate_phone_variants("+12345678900")
>>> set(variants) == {"+12345678900", "12345678900", "2345678900"}
True
>>> variants = generate_phone_variants("2345678900")
>>> "+12345678900" in variants
True
>>> "12345678900" in variants
True
>>> generate_phone_variants("user@example.com")
['user@example.com']

Analytics

Summary statistics across all messages.

Compute overall messaging activity statistics.

Parameters:

Name Type Description Default
df DataFrame

DataFrame produced by get_messages().

required
start str | Timestamp | None

Optional start date for filtering (ISO string or Timestamp).

None
end str | Timestamp | None

Optional end date for filtering (ISO string or Timestamp).

None
last_n_days int | None

If provided, overrides start/end and filters to the last N days relative to reference_date.

None
reference_date Timestamp | None

Reference point for last_n_days. Defaults to now (UTC).

None
top_n int

Number of top contacts to include in top_contacts_df.

10

Returns:

Type Description
tuple[DataFrame, DataFrame]

Tuple of (summary_df, top_contacts_df).

summary_df is a single-row DataFrame with columns:

  • total_messages (int)
  • total_sent (int)
  • total_received (int)
  • avg_messages_per_day (float)
  • unique_contacts (int)
  • most_active_day_of_week (str, e.g. "Saturday")
  • most_active_hour (int, 0–23)
  • late_night_contacts (list[str])
  • pct_messages_with_attachments (float, 0–1)
  • avg_message_length (float)
  • avg_response_time_seconds (float)
  • conversations_initiated (int)
  • conversations_received (int)
  • ghost_contacts (list[str])

top_contacts_df has columns: contact, total, sent, received. It is sorted descending by total and limited to top_n rows.

Examples:

>>> from pymessage import EXAMPLE_BACKUP, get_messages, get_activity_summary
>>> df = get_messages(EXAMPLE_BACKUP)
>>> summary, top = get_activity_summary(df)
>>> print(summary["total_messages"].iloc[0])
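
The date-window parameters described for this function (start, end, last_n_days, reference_date) could be resolved along these lines. This is a sketch under stated assumptions, not the library's logic: resolve_window and _parse are hypothetical names, standard-library datetimes stand in for pandas Timestamps, and the window for last_n_days is assumed to run from reference_date minus N days up to reference_date.

```python
from datetime import datetime, timedelta, timezone


def _parse(value):
    """Accept None, a datetime, or an ISO string; naive strings become UTC."""
    if value is None or isinstance(value, datetime):
        return value
    parsed = datetime.fromisoformat(value)
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def resolve_window(start=None, end=None, last_n_days=None, reference_date=None):
    """Return the (start, end) bounds to filter on; None means unbounded."""
    if last_n_days is not None:
        # last_n_days takes precedence over any explicit start/end.
        ref = reference_date or datetime.now(timezone.utc)
        return ref - timedelta(days=last_n_days), ref
    return _parse(start), _parse(end)
```

For example, resolve_window(start="2020-01-01", last_n_days=30, reference_date=ref) ignores the start argument and returns the 30 days ending at ref.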

Per-contact messaging statistics.

Compute per-contact messaging statistics.

Parameters:

Name Type Description Default
df DataFrame

DataFrame produced by get_messages().

required
contact str

Phone number or email to summarize. All format variants are checked (e.g. "+12345678900", "2345678900").

required
start str | Timestamp | None

Optional start date for filtering.

None
end str | Timestamp | None

Optional end date for filtering.

None
last_n_days int | None

If provided, overrides start/end.

None
reference_date Timestamp | None

Reference point for last_n_days. Defaults to now (UTC).

None

Returns:

Type Description
DataFrame

Single-row DataFrame with columns:

  • total_messages (int)
  • total_sent (int)
  • total_received (int)
  • send_receive_ratio (float): total_sent / total_received. Returns float("inf") if no messages have been received from this contact.
  • avg_messages_per_active_day (float)
  • total_active_days (int)
  • avg_read_time_seconds (float)
  • avg_response_time_you_seconds (float)
  • avg_response_time_contact_seconds (float)
  • conversations_initiated_you (int)
  • conversations_initiated_contact (int)
  • longest_gap_days (float)
  • messages_with_attachments (int)
  • avg_message_length_you (float)
  • avg_message_length_contact (float)
  • short_message_count_you (int)
  • short_message_count_contact (int)
  • most_active_hour (int, 0–23)
  • most_active_day_of_week (str)

Examples:

>>> from pymessage import EXAMPLE_BACKUP, get_messages, get_contact_summary
>>> df = get_messages(EXAMPLE_BACKUP)
>>> s = get_contact_summary(df, "+18015550002")
>>> print(s["total_messages"].iloc[0])
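
The send_receive_ratio column has one documented edge case: no received messages. A minimal sketch of that rule follows; the function name mirrors the column but is hypothetical as a standalone helper.

```python
def send_receive_ratio(total_sent: int, total_received: int) -> float:
    """Ratio of sent to received messages, guarding against division by zero."""
    # Documented behaviour: infinity when nothing was received.
    if total_received == 0:
        return float("inf")
    return total_sent / total_received
```

Because the result can be infinite, code that averages or plots this column may want to filter with math.isinf first.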

Build a 7×24 message-count heatmap for a contact.

Parameters:

Name Type Description Default
df DataFrame

DataFrame produced by get_messages().

required
contact str

Phone number or email to filter on.

required
start str | Timestamp | None

Optional start date for filtering.

None
end str | Timestamp | None

Optional end date for filtering.

None
last_n_days int | None

If provided, overrides start/end.

None
reference_date Timestamp | None

Reference point for last_n_days. Defaults to now (UTC).

None

Returns:

Type Description
DataFrame

7×24 DataFrame where:

  • Index: day-of-week strings Monday through Sunday
  • Columns: integers 0–23 (hours)
  • Values: message counts (int)

Examples:

>>> from pymessage import EXAMPLE_BACKUP, get_messages, get_contact_heatmap
>>> df = get_messages(EXAMPLE_BACKUP)
>>> heatmap = get_contact_heatmap(df, "+18015550003")
>>> print(heatmap.shape)  # (7, 24)
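
The 7×24 layout returned by this function can be illustrated without pandas. The sketch below is an assumption-laden stand-in: build_heatmap is a hypothetical name, it takes a plain list of datetimes instead of a messages DataFrame, and it returns a dict of day name to 24 hourly counts rather than a DataFrame.

```python
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def build_heatmap(timestamps):
    """Count messages per (day of week, hour) cell."""
    # Start with a zero-filled grid: 7 days, 24 hours each.
    grid = {day: [0] * 24 for day in DAYS}
    for ts in timestamps:
        # datetime.weekday() returns 0 for Monday through 6 for Sunday.
        grid[DAYS[ts.weekday()]][ts.hour] += 1
    return grid


grid = build_heatmap([
    datetime(2024, 1, 1, 9),   # a Monday
    datetime(2024, 1, 8, 9),   # the following Monday
    datetime(2024, 1, 7, 23),  # a Sunday
])
```

Cells with no messages stay at zero, so the grid always has the full 7×24 shape regardless of how sparse the conversation is.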