GitHub Webhooks
Mozilla collects and republishes GitHub Webhooks for a number of Mozilla’s organizations and projects.
Overall Architecture
GitHub Webhooks are configured at the organizational or project level to publish application/json payloads to https://3abyt2fapj.execute-api.us-west-2.amazonaws.com/prod/webhook.
These HTTP requests are delivered to an Amazon API Gateway service operated by the Developer Productivity team. Each webhook request invokes an AWS Lambda function which does the following:
- Publishes the record to an AWS Kinesis Firehose
- Publishes the record to an AWS SNS topic that receives all events and, if the event is non-private, also to a public AWS SNS topic.
Data published to the Kinesis Firehose is flushed to Amazon S3 for long-term storage and to facilitate analytics.
Additional AWS Lambda functions consume the public SNS topic and republish events to other channels, such as Pulse.
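The fan-out described above can be sketched roughly as follows. This is only an illustration of the flow, not the service's actual Lambda code: the function name `handle_webhook`, the injected publisher callables, and the record shape (which is assumed to mirror the Pulse message keys) are all hypothetical.

```python
import json
from typing import Callable


def handle_webhook(
    event_name: str,
    request_id: str,
    payload: dict,
    put_firehose: Callable[[bytes], None],    # hypothetical: writes one record to the Firehose
    publish_all: Callable[[str], None],       # hypothetical: publishes to the "all events" SNS topic
    publish_public: Callable[[str], None],    # hypothetical: publishes to the public SNS topic
    is_public: Callable[[str, dict], bool],   # hypothetical: decides whether the event is non-private
) -> None:
    """Fan a single webhook request out to the three destinations."""
    # Assumed record shape; the real service may serialize differently.
    record = json.dumps(
        {"event": event_name, "request_id": request_id, "payload": payload},
        sort_keys=True,
    )
    # Every event goes to the Firehose (and from there to S3).
    put_firehose((record + "\n").encode("utf-8"))
    # Every event goes to the topic that carries all events.
    publish_all(record)
    # Only non-private events reach the public topic.
    if is_public(event_name, payload):
        publish_public(record)
```

Passing the publishers in as callables keeps the sketch independent of any particular AWS client library and makes the routing decision easy to test in isolation.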
Private Events
While the ingestion server often receives all events for an organization or repository, not all events are republished in public channels.
The following events are excluded from public channels:
- Any event belonging to a private repository
- Team membership changes (membership and team_add events)
- Transition of repository from private to public (public event)
- Repository creation, deletion, or public/private transitions (repository event)
- Any new events GitHub adds that aren’t in a list of allowed events
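The exclusion rules above amount to a small predicate. The following is a minimal sketch under stated assumptions: the contents of `ALLOWED_EVENTS` are invented for illustration (the real allow list is not given here), the function name is hypothetical, and treating a payload without a `repository.private` flag as private is a conservative guess rather than documented behaviour.

```python
# Hypothetical allow list; the service's real list is not reproduced in this document.
ALLOWED_EVENTS = frozenset({"push", "issues", "status", "pull_request"})

# Event types the rules above name as never republished publicly.
ALWAYS_PRIVATE_EVENTS = frozenset({"membership", "team_add", "public", "repository"})


def is_public_event(event_name: str, payload: dict) -> bool:
    """Return True only if the event may be republished on public channels."""
    # Explicitly excluded event types.
    if event_name in ALWAYS_PRIVATE_EVENTS:
        return False
    # Anything not on the allow list is private, including
    # event types GitHub introduces in the future.
    if event_name not in ALLOWED_EVENTS:
        return False
    # Events from private repositories are excluded. A missing flag is
    # treated as private (an assumption made for this sketch).
    repository = payload.get("repository") or {}
    return repository.get("private") is False
```

Using an allow list rather than a deny list means an unrecognized event type stays private by default, which matches the last rule in the list.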
Pulse Notifications
Pulse is a RabbitMQ exchange operated by Mozilla that serves as a nexus of event publishing for various systems.
GitHub Webhook events are republished to the exchange/github-webhooks/v1 exchange.
The routing key for each message is of the form <repository>/<event>, where <repository> is the GitHub account/organization + repository and <event> is the GitHub event name, e.g. mozilla/gecko-dev/push or servo/servo/issues.
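As an illustration of the routing key format, the helpers below build a key and split one back into its parts. The function names are hypothetical, and the split assumes that event names never contain a slash, so the last slash always separates the repository from the event.

```python
from typing import Tuple


def make_routing_key(repository: str, event: str) -> str:
    """Build a key of the form <repository>/<event>."""
    return f"{repository}/{event}"


def split_routing_key(routing_key: str) -> Tuple[str, str]:
    """Split a routing key into (repository, event).

    The repository part itself contains a slash (account/repository),
    so the key is split on the last slash only.
    """
    repository, _, event = routing_key.rpartition("/")
    if not repository or not event:
        raise ValueError(f"unexpected routing key: {routing_key!r}")
    return repository, event
```

For example, `split_routing_key("mozilla/gecko-dev/push")` yields the repository `mozilla/gecko-dev` and the event `push`.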
The JSON message published to Pulse has the following relevant keys:
event
- GitHub event name, e.g. push, issues, or status.
request_id
- UUID uniquely identifying this message. The ID is generated by GitHub.
payload
- The payload of the GitHub event. The formats are documented at https://developer.github.com/v3/activity/events/types/.
Delivery of GitHub events to Pulse is best effort. If Pulse is down, data may fail to publish.
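A consumer might decode a message body along these lines. This is a sketch only: it covers just the three keys described above, the `WebhookMessage` type and `parse_message` function are hypothetical names, and connecting to Pulse itself is out of scope here. Because delivery is best effort, a consumer should not assume it sees every event.

```python
import json
from typing import NamedTuple, Union


class WebhookMessage(NamedTuple):
    event: str        # GitHub event name, e.g. "push"
    request_id: str   # UUID generated by GitHub
    payload: dict     # the GitHub event payload


def parse_message(body: Union[str, bytes, dict]) -> WebhookMessage:
    """Decode a message body and pull out the documented keys."""
    # Accept either raw JSON or an already-decoded mapping.
    if isinstance(body, (str, bytes, bytearray)):
        data = json.loads(body)
    else:
        data = body
    missing = [key for key in ("event", "request_id", "payload") if key not in data]
    if missing:
        raise ValueError(f"message is missing keys: {missing}")
    return WebhookMessage(data["event"], data["request_id"], data["payload"])
```

Validating the keys up front keeps malformed messages from surfacing later as confusing `KeyError`s deep in consumer code.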
SNS Topic
Non-private GitHub events are published to the arn:aws:sns:us-west-2:699292812394:github-webhooks-public AWS SNS topic.
Kinesis Firehose and S3 Access
Access to the streaming GitHub data in Kinesis Firehose and the historical data retained in S3 can be granted on a case-by-case basis. If interested, email developer-services@mozilla.org.