PyGithub Concepts

From NovaOrdis Knowledge Base
Revision as of 21:46, 18 May 2023 by Ovidiu (talk | contribs) (→‎Pagination)

Internal

Github

https://pygithub.readthedocs.io/en/latest/github.html

The main class to instantiate to access the GitHub API. It is used to configure authentication, via username and password, personal access token (PAT) or JWT, the base URL, the timeout, the user agent, the Retry strategy, the page size and the pool size.

Base URL

The base_url consists of the host URL and API endpoint:

from github import Github

host_url = 'https://github.com'
api_endpoint = 'api/v3'
base_url = f'{host_url}/{api_endpoint}'
github = Github(base_url=base_url, ...)
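
Stray or missing slashes are easy to introduce when the two parts come from configuration. A minimal sketch of joining them safely follows; build_base_url is a hypothetical helper name, not part of PyGithub:

```python
def build_base_url(host_url: str, api_endpoint: str) -> str:
    """Join a host URL and an API endpoint with exactly one slash between them."""
    return f"{host_url.rstrip('/')}/{api_endpoint.strip('/')}"


# Hypothetical usage; the result is the value to pass as Github(base_url=...).
base_url = build_base_url('https://github.com/', '/api/v3')
```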

Authentication

Various authentication mechanisms are selected by configuring the main Github class appropriately.

Authentication with Personal Access Token (PAT)

import os

from github import Github

github_pat = os.environ.get('GITHUB_PAT')
if not github_pat:
    raise ValueError("'GITHUB_PAT' not found in environment")
github = Github(base_url='https://github.com/api/v3', login_or_token=github_pat)

Authentication with Username and Password

github = Github(base_url='https://github.com/api/v3', login_or_token='someusername', password='somepassword')

Note that the constructor invocation does not fail if the username or the password is invalid. Subsequent calls are made as an unauthenticated user.

Authentication with JWT

Authenticated SSL Client

If the client cannot verify the certificate presented by the GitHub server, for example because the certificate chain includes a self-signed certificate, an invocation fails with:

requests.exceptions.SSLError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /api/v3/user/repos?per_page=100 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)')))

The Retry Strategy

from github import Github
from urllib3.util.retry import Retry

status_forcelist = [403, 500, 502, 504]  # retry 403s, 5XX from GitHub
retry = Retry(total=10, backoff_factor=0.2, raise_on_status=True, status_forcelist=status_forcelist)
github = Github(base_url='...', retry=retry, ...)
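
To get a feel for what backoff_factor=0.2 means in practice, the sketch below approximates the delays between retries. It assumes the formula documented by urllib3 (no sleep before the first retry, then backoff_factor * 2 ** (n - 1)) and ignores the maximum backoff cap, jitter and any Retry-After header; backoff_schedule is a hypothetical helper, not a urllib3 function.

```python
from typing import List


def backoff_schedule(backoff_factor: float, retries: int) -> List[float]:
    """Approximate sleep in seconds before each retry, ignoring the cap and jitter."""
    delays = []
    for attempt in range(1, retries + 1):
        # Assumption: urllib3 does not sleep before the first retry,
        # then sleeps backoff_factor * 2 ** (attempt - 1) seconds.
        delays.append(0.0 if attempt == 1 else backoff_factor * 2 ** (attempt - 1))
    return delays


# Delays before the first four retries for the configuration above.
schedule = backoff_schedule(0.2, 4)
```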

Pagination

Responses that may include a large number of elements are returned as PaginatedList instances. The page size is controlled by the per_page parameter of the Github constructor. If not specified, the default is 30.

PaginatedList

The total number of elements can be obtained with paginated_list.totalCount.
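
The sketch below relates the element count to the number of page requests and reads only the first few elements. It assumes that iterating a PaginatedList fetches pages on demand, so stopping early avoids requesting the remaining pages. Both function names are hypothetical helpers, and first_n works with any iterable.

```python
import math
from itertools import islice


def pages_needed(total_count: int, per_page: int = 30) -> int:
    """Number of page requests needed to read total_count elements."""
    return math.ceil(total_count / per_page)


def first_n(paginated_list, n: int) -> list:
    """Return at most n elements, stopping the iteration once n have been read."""
    return list(islice(paginated_list, n))
```

For example, first_n(repo.get_issues(), 5) would read only as many pages as needed to produce five issues.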

User

There are AuthenticatedUsers and NamedUsers.

Github#get_users() returns all users of the GitHub instance (site-wide, not scoped to an organization), in the order in which they signed up.

The current user is returned as an AuthenticatedUser instance in response to Github#get_user(), even if the authentication failed. However, the instance is lazily instantiated, so the first attempt to read a specific attribute ends up invoking _requester.requestJsonAndCheck(...), which fails with a 403 "API rate limit exceeded [...] Authenticated requests get a higher rate limit."
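
Because the failure only surfaces on first attribute access, a credentials check has to read an attribute explicitly. The sketch below uses a hypothetical helper, current_login_or_none, that accepts any object with a get_user() method. It catches Exception broadly to stay self-contained; with PyGithub the error is expected to be a GithubException subclass, which is an assumption here.

```python
def current_login_or_none(client):
    """Return the current user's login, or None if the API call fails."""
    user = client.get_user()  # lazy: per the text above, no request is made yet
    try:
        return user.login  # first attribute access triggers the HTTP request
    except Exception as error:  # assumption: PyGithub raises a GithubException subclass
        print(f"current user lookup failed: {error}")
        return None
```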

AuthenticatedUser

https://pygithub.readthedocs.io/en/latest/github_objects/AuthenticatedUser.html
GitHub Concepts | Authenticated User

Accessing any of the following attributes on an AuthenticatedUser instance corresponding to the "current user" is equivalent to:

curl -L -H "Accept: application/vnd.github+json"  -H "Authorization: Bearer ${GITHUB_PAT}" -H "X-GitHub-Api-Version: 2022-11-28" https://github.com/api/v3/user

To get a specific user:

curl -L -H "Accept: application/vnd.github+json"  -H "Authorization: Bearer ${GITHUB_PAT}" -H "X-GitHub-Api-Version: 2022-11-28" https://github.com/api/v3/users/<user-login>
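
For comparison, the same two requests can be built in Python with the standard library alone. This is a sketch of what the curl commands send, not of PyGithub's internals; build_user_request is a hypothetical helper and the token value is a placeholder.

```python
import urllib.request
from typing import Optional


def build_user_request(base_url: str, token: str, login: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the request for the current user or a named user."""
    path = "user" if login is None else f"users/{login}"
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{path}",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


# Sending it would be urllib.request.urlopen(request); the response body is JSON.
request = build_user_request("https://github.com/api/v3", "<token>")
```

In PyGithub terms, the first request corresponds to github.get_user() and the second to github.get_user('<user-login>').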

login

id

name

type

User.

url

updated_at

node_id

organizations_url

repos_url

NamedUser

Repository

https://pygithub.readthedocs.io/en/latest/examples/Repository.html

Repository Attributes

full_name

Get a Repository

github = Github(...)
repo = github.get_repo('some-owner/some-repo')

Issue

https://pygithub.readthedocs.io/en/latest/examples/Issue.html
repo = ...
issues = repo.get_issues(state=state, milestone=milestone)
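
GitHub's issues endpoint also returns pull requests, so a caller interested only in plain issues usually filters them out. A minimal sketch follows, assuming that pull requests are recognizable by a non-None pull_request attribute on the issue; the function name is hypothetical.

```python
def issues_without_pull_requests(repo, state="open"):
    """Return the issues of repo in the given state, skipping pull requests."""
    return [
        issue
        for issue in repo.get_issues(state=state)
        # Pull requests are reported as issues with a non-None pull_request attribute.
        if issue.pull_request is None
    ]
```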

Milestone

repo = ...
milestones = repo.get_milestones()
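
A common follow-up is to look up a milestone by title so that it can be passed to repo.get_issues(milestone=...). A minimal sketch with a hypothetical helper:

```python
def find_milestone(repo, title):
    """Return the first milestone of repo whose title matches, or None."""
    for milestone in repo.get_milestones():
        if milestone.title == title:
            return milestone
    return None
```

The result should be checked for None before it is passed to repo.get_issues(), since that method expects a milestone object.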

Pull Request (PR)

Get Multiple PRs from a Repository

Calls GET /repos/<owner>/<repo>/pulls. See: https://docs.github.com/en/rest/reference/pulls for reference. Returns a paginated list of PullRequest instances.

repository = ...
paginated_list = repository.get_pulls(
  state='open',  # or 'closed', 'all'
  base='...',
  sort='...',
  direction='...',
  # per_page is not a get_pulls() argument; set it on the Github constructor
)

state

If the state is invalid (not one of 'open', 'closed' or 'all'), the method returns a result corresponding to 'open'. For more details, see:

GitHub Concepts | PR States

base

The base branch name. If no such branch exists, the method returns an empty list.

sort

A string indicating what to sort results by:

  • 'popularity': sort by the number of comments.
  • 'created': sort by creation date (default).
  • 'updated': sort by the date of the last update.
  • 'long-running': sort by creation date and limit the results to pull requests that have been open for more than a month and have had activity within the past month.

direction

The direction of the sort. Default: 'desc' when sort is 'created' or not specified, otherwise 'asc'.

per_page

Number of results per page, as integer. Default 30, max 100.

page

Page number of the results to fetch, default 1.
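
The parameters above can be combined as in the following sketch, which returns the open PRs against a base branch that have gone longest without an update. The function name and the limit parameter are hypothetical; only state, base, sort and direction are passed to get_pulls().

```python
from itertools import islice


def least_recently_updated_pulls(repository, base, limit=10):
    """Return up to `limit` open PRs against `base`, least recently updated first."""
    pulls = repository.get_pulls(state="open", base=base, sort="updated", direction="asc")
    # islice stops the iteration early instead of reading the whole paginated list.
    return list(islice(pulls, limit))
```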

Get One PR from a Repository

Calls GET /repos/<owner>/<repo>/pulls/<pr-number>.

See: https://docs.github.com/en/rest/reference/pulls
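
In PyGithub, this call is exposed as repository.get_pull(number). A minimal sketch that fetches one PR and summarizes it follows; describe_pull is a hypothetical helper, and number, title and state are attributes of the returned PullRequest.

```python
def describe_pull(repository, number: int) -> str:
    """Fetch one pull request by number and return a one-line summary."""
    pull = repository.get_pull(number)
    return f"#{pull.number} {pull.title} [{pull.state}]"
```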