Introduction
In this tutorial, we're going to learn how to use the pyautogui library in Python 3. The PyAutoGUI library provides cross-platform support for controlling the mouse and keyboard through code, which makes it possible to automate tasks. The pyautogui library is also available for Python 2; however, we will be using Python 3 throughout this tutorial.
A tool like this has many applications, including taking screenshots, automating GUI testing (much as Selenium does for web browsers), and automating tasks that can only be done through a GUI.
Before you go ahead with this tutorial, please note that there are a few prerequisites. You should have a basic understanding of Python's syntax, and/or have done at least beginner level programming in some other language. Other than that, the tutorial is quite simple and easy to follow for beginners.
Installation
The installation process for PyAutoGUI is fairly simple on all operating systems. However, Mac and Linux have a few dependencies that need to be installed before the PyAutoGUI library can be installed and used in programs.
Windows
For Windows, PyAutoGUI has no dependencies. Simply run the following command in your command prompt to install it:
$ pip install PyAutoGUI
Mac
For Mac, the pyobjc-core and pyobjc modules need to be installed first, in that order. Run the following commands in sequence in your terminal:
$ pip3 install pyobjc-core
$ pip3 install pyobjc
$ pip3 install pyautogui
Linux
For Linux, the only dependency is python3-xlib (for Python 3). To install it, followed by pyautogui, run the two commands below in your terminal:
$ pip3 install python3-xlib
$ pip3 install pyautogui
Basic Code Examples
In this section, we are going to cover some of the most commonly used functions from the PyAutoGUI library.
Generic Functions
The position() Function
Before we can use PyAutoGUI functions, we need to import the library into our program:
import pyautogui as pag
The position() function tells us the current position of the mouse on our screen:
pag.position()
Output:
Point(x=643, y=329)
The onScreen() Function
The onScreen() function tells us whether the point with coordinates x and y exists on the screen:
print(pag.onScreen(500, 600))
print(pag.onScreen(0, 10000))
Output:
True
False
Here we can see that the first point exists on the screen, but the second point falls beyond the screen's dimensions.
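To make the check concrete, here is a minimal pure-Python sketch of the kind of bounds test that onScreen() performs. The helper name is_on_screen and its explicit width and height arguments are assumptions made for illustration; the real function reads the resolution from the system.

```python
def is_on_screen(x, y, width, height):
    """Return True if (x, y) lies inside a width-by-height screen.

    Hypothetical helper: coordinates start at 0, so the last valid
    pixel is (width - 1, height - 1).
    """
    return 0 <= x < width and 0 <= y < height


# Checked against a 1440x900 screen
print(is_on_screen(500, 600, 1440, 900))   # True
print(is_on_screen(0, 10000, 1440, 900))   # False
```

Note that a point whose x equals the screen width is already off screen, because counting starts at 0.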
The size() Function
The size() function returns the width and height (resolution) of the screen:
pag.size()
Output:
Size(width=1440, height=900)
Your output may be different and will depend on your screen's size.
Common Mouse Operations
In this section, we are going to cover PyAutoGUI functions for mouse manipulation, which include both moving the cursor and clicking buttons automatically through code.
The moveTo() Function
The syntax of the moveTo() function is as follows:
pag.moveTo(x_coordinate, y_coordinate)
The value of x_coordinate increases from left to right on the screen, and the value of y_coordinate increases from top to bottom. The value of both x_coordinate and y_coordinate at the top left corner of the screen is 0.
Look at the following script:
pag.moveTo(0, 0)
pag.PAUSE = 2  # pause for 2 seconds after each subsequent PyAutoGUI call
pag.moveTo(100, 500)
pag.moveTo(500, 500)
In the code above, the main focus is the moveTo() function, which moves the mouse cursor on the screen based on the coordinates we provide as parameters. The first parameter is the x-coordinate and the second parameter is the y-coordinate. It is important to note that these coordinates represent the absolute position of the cursor.
One more thing introduced in the code above is the PAUSE property. It makes PyAutoGUI pause for the given number of seconds after each of its function calls. It is set here so that you can see each move happen; otherwise, the calls would execute in a split second and you would not be able to see the cursor moving from one location to the next.
Another option is to pass the duration of each moveTo() operation as the third parameter, e.g. moveTo(x, y, time_in_seconds), which makes the cursor glide to the target over that time instead of jumping.
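As a rough sketch of the duration approach, the helper below visits a list of points and spends a fixed time on each move. The name trace_path and the idea of passing the move function in as an argument are assumptions of this example; they only make it possible to try the logic without a real mouse.

```python
def trace_path(points, move, seconds_per_move=1.0):
    """Visit each (x, y) point in order, spending seconds_per_move on each move.

    `move` is any callable with the shape move(x, y, duration).
    """
    for x, y in points:
        move(x, y, seconds_per_move)


def demo():
    """Run the path with the real cursor; needs a desktop session."""
    import pyautogui as pag

    # The third positional argument of moveTo() is the duration in seconds
    trace_path([(100, 100), (100, 500), (500, 500)], pag.moveTo, 2)
```

Calling demo() on a machine with a display moves the real cursor, so each leg of the path takes about two seconds.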
Executing the above script may result in the following error:
Note: Possible Error
Traceback (most recent call last):
  File "a.py", line 5, in <module>
    pag.moveTo (100, 500)
  File "/anaconda3/lib/python3.6/site-packages/pyautogui/__init__.py", line 811, in moveTo
    _failSafeCheck()
  File "/anaconda3/lib/python3.6/site-packages/pyautogui/__init__.py", line 1241, in _failSafeCheck
    raise FailSafeException ('PyAutoGUI fail-safe triggered from mouse moving to a corner of the screen. To disable this fail-safe, set pyautogui.FAILSAFE to False. DISABLING FAIL-SAFE IS NOT RECOMMENDED.')
pyautogui.FailSafeException: PyAutoGUI fail-safe triggered from mouse moving to a corner of the screen. To disable this fail-safe, set pyautogui.FAILSAFE to False. DISABLING FAIL-SAFE IS NOT RECOMMENDED.
If the moveTo() function raises an error like the one shown above, PyAutoGUI's fail-safe has been triggered: the fail-safe is enabled by default, and the script itself moved the cursor into a corner of the screen with moveTo(0, 0). To disable the fail-safe, add the following line at the start of your code:
pag.FAILSAFE = False
The fail-safe is enabled by default so that you can easily stop a runaway pyautogui program by manually moving the mouse to the upper left corner of the screen (as the error message above indicates, recent versions react to any corner). Once the mouse is there, the next pyautogui call raises a FailSafeException, which stops the script unless it is caught.
The moveRel() Function
The coordinates of the moveTo() function are absolute. However, if you want to move the mouse relative to its current position, you can use the moveRel() function.
What this means is that the reference point for this function is not the top left point of the screen (0, 0), but the current position of the mouse cursor. So, if your mouse cursor is currently at point (100, 100) on the screen and you call the moveRel() function with the parameters (100, 100, 2), the new position of your mouse cursor will be (200, 200).
You can use the moveRel() function as shown below:
pag.moveRel(100, 100, 2)
The above script will move the cursor 100 points to the right and 100 points down in 2 seconds, with respect to the current cursor position.
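The arithmetic behind a relative move can be sketched in a few lines. The helper relative_target is hypothetical and is not part of PyAutoGUI; it only computes the absolute point that a relative move ends on.

```python
def relative_target(current, dx, dy):
    """Return the absolute point reached by moving (dx, dy) from `current`."""
    x, y = current
    return (x + dx, y + dy)


# Starting at (100, 100) and moving by (100, 100)
print(relative_target((100, 100), 100, 100))   # (200, 200)
# Negative offsets move left and up
print(relative_target((200, 200), -50, -150))  # (150, 50)
```

In other words, a moveRel() call with offsets (dx, dy) ends at the same point as a moveTo() call on the computed target.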
The click() Function
The click() function is used to imitate mouse click operations. The syntax for the click() function is as follows:
pag.click(x, y, clicks, interval, button)
The parameters are explained as follows:
- x: the x-coordinate of the point to reach
- y: the y-coordinate of the point to reach
- clicks: the number of clicks to perform once the cursor gets to that point on the screen
- interval: the amount of time in seconds between each mouse click, i.e. if you are doing multiple mouse clicks
- button: which button on the mouse to press once the cursor gets to that point on the screen. The possible values are right, left, and middle.
Here is an example:
pag.click(100, 100, 5, 2, 'right')
You can also execute specific click functions as follows:
pag.rightClick(x, y)
pag.doubleClick(x, y)
pag.tripleClick(x, y)
pag.middleClick(x, y)
Here x and y represent the x and y coordinates, just like in the previous functions.
You can also have more fine-grained control over mouse clicks by specifying when to press the mouse button down and when to release it. This is done using the mouseDown() and mouseUp() functions, respectively.
Here is a short example:
pag.mouseDown(x=x, y=y, button='left')
pag.mouseUp(x=x, y=y, button='left')
The above code is equivalent to just doing a pag.click(x, y) call.
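Separating the press from the release becomes useful when something has to happen in between, such as a drag. The sketch below presses the left button at one point, glides to another, and releases. The helper name drag and the injected callables are assumptions of this example, chosen so the sequence can be checked without moving a real mouse.

```python
def drag(start, end, mouse_down, move_to, mouse_up, seconds=1.0):
    """Press the left button at `start`, glide to `end`, then release."""
    mouse_down(x=start[0], y=start[1], button='left')
    move_to(end[0], end[1], seconds)
    mouse_up(x=end[0], y=end[1], button='left')


def demo():
    """Drag with the real mouse; needs a desktop session."""
    import pyautogui as pag

    drag((200, 200), (400, 300), pag.mouseDown, pag.moveTo, pag.mouseUp)
```

PyAutoGUI also provides its own dragTo() function for this; the sketch only shows how the lower-level calls compose.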
The scroll() Function
The last mouse function we are going to cover is scroll(). As expected, it has two options: scroll up and scroll down. The syntax for the scroll() function is as follows:
pag.scroll(amount_to_scroll, x=x_movement, y=y_movement)
To scroll up, specify a positive value for the amount_to_scroll parameter, and to scroll down, specify a negative value. The optional x and y arguments move the cursor to that point before scrolling. Here is an example:
pag.scroll(100, 120, 120)
Alright, this was it for the mouse functions. By now, you should be able to control your mouse's buttons as well as movements through code. Let's now move to keyboard functions. There are plenty, but we will cover only those that are most frequently used.
Common Keyboard Operations
Before we move to the functions, it is important that we know which keys can be pressed through code in pyautogui, as well as their exact naming convention. To do so, run the following script:
print(pag.KEYBOARD_KEYS)
Output:
['\t', '\n', '\r', ' ', '!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '<', '=', '>', '?', '@', '[', '\\', ']', '^', '_', '`', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '{', '|', '}', '~', 'accept', 'add', 'alt', 'altleft', 'altright', 'apps', 'backspace', 'browserback', 'browserfavorites', 'browserforward', 'browserhome', 'browserrefresh', 'browsersearch', 'browserstop', 'capslock', 'clear', 'convert', 'ctrl', 'ctrlleft', 'ctrlright', 'decimal', 'del', 'delete', 'divide', 'down', 'end', 'enter', 'esc', 'escape', 'execute', 'f1', 'f10', 'f11', 'f12', 'f13', 'f14', 'f15', 'f16', 'f17', 'f18', 'f19', 'f2', 'f20', 'f21', 'f22', 'f23', 'f24', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'final', 'fn', 'hanguel', 'hangul', 'hanja', 'help', 'home', 'insert', 'junja', 'kana', 'kanji', 'launchapp1', 'launchapp2', 'launchmail', 'launchmediaselect', 'left', 'modechange', 'multiply', 'nexttrack', 'nonconvert', 'num0', 'num1', 'num2', 'num3', 'num4', 'num5', 'num6', 'num7', 'num8', 'num9', 'numlock', 'pagedown', 'pageup', 'pause', 'pgdn', 'pgup', 'playpause', 'prevtrack', 'print', 'printscreen', 'prntscrn', 'prtsc', 'prtscr', 'return', 'right', 'scrolllock', 'select', 'separator', 'shift', 'shiftleft', 'shiftright', 'sleep', 'space', 'stop', 'subtract', 'tab', 'up', 'volumedown', 'volumemute', 'volumeup', 'win', 'winleft', 'winright', 'yen', 'command', 'option', 'optionleft', 'optionright']
The typewrite() Function
The typewrite() function is used to type something in a text field. The syntax for the function is as follows:
pag.typewrite(text, interval)
Here text is what you wish to type in the field and interval is the time in seconds between each keystroke. Here is an example:
pag.typewrite('Junaid Khalid', 1)
Executing the script above will enter the text Junaid Khalid in the field that is currently selected, with a pause of 1 second between key presses.
Another way this function can be used is by passing in a list of keys that you'd like to press in a sequence. To do that through code, see the example below:
pag.typewrite(['j', 'u', 'n', 'a', 'i', 'd', 'e', 'backspace', 'enter'])
In the above example, the text junaide would be entered, followed by the removal of the trailing e. The input in the text field will then be submitted by pressing the Enter key.
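To see why the key list above ends up submitting junaid, here is a toy model of a text field that receives those keys. It is only a sketch: simulate_typing is a hypothetical helper that understands single characters, backspace, and enter, and ignores every other key name.

```python
def simulate_typing(keys):
    """Return (text, submitted) after applying a key list to an empty field."""
    text = []
    submitted = False
    for key in keys:
        if key == 'backspace':
            if text:
                text.pop()        # remove the last character, if any
        elif key == 'enter':
            submitted = True      # the field's content is submitted
        elif len(key) == 1:
            text.append(key)      # an ordinary character key
    return ''.join(text), submitted


keys = ['j', 'u', 'n', 'a', 'i', 'd', 'e', 'backspace', 'enter']
print(simulate_typing(keys))  # ('junaid', True)
```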
The hotkey() Function
If you haven't noticed so far, the keys shown above do not cover combined operations like Control + C for the copy command. You might think you could do that by passing the list ['ctrl', 'c'] to the typewrite() function, but that does not work: typewrite() would press those keys one after the other, not simultaneously. As you probably already know, to execute the copy command you need to press the C key while holding the ctrl key.
To press two or more keys simultaneously, you can use the hotkey() function, as shown here:
pag.hotkey('shift', 'enter')
pag.hotkey('shift', '2') # For the @ symbol (on a US keyboard layout)
pag.hotkey('ctrl', 'c') # For the copy command
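Conceptually, a hotkey holds the keys down in the order given and then releases them in reverse order. The sketch below spells that out with the lower-level keyDown() and keyUp() functions. The helper press_combo and its injected callables are assumptions made for illustration, not PyAutoGUI's own implementation.

```python
def press_combo(keys, key_down, key_up):
    """Hold the keys down in the given order, then release them in reverse."""
    for key in keys:
        key_down(key)
    for key in reversed(keys):
        key_up(key)


def demo():
    """Send Ctrl + C to the focused window; needs a desktop session."""
    import pyautogui as pag

    press_combo(['ctrl', 'c'], pag.keyDown, pag.keyUp)
```

Releasing in reverse order means the modifier key stays held until every other key in the combination has been released.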
The screenshot() Function
If you would like to take a screenshot of the screen at any point, the screenshot() function is the one you are looking for. Let's see how to use it with PyAutoGUI:
screen_shot = pag.screenshot()
This will store a PIL Image object containing the screenshot in a variable.
If, however, you want to save the screenshot directly to your computer, you can call the screenshot() function like this instead:
pag.screenshot('ss.png')
This will save the screenshot in a file with the given filename on your computer.
The confirm(), alert(), and prompt() Functions
The last set of functions that we are going to cover in this tutorial are the message box functions. Here is a list of the message box functions available in PyAutoGUI:
- Confirmation Box: Displays information and gives you two options, i.e. OK and Cancel
- Alert Box: Displays some information for you to acknowledge that you have read it. It displays a single button, i.e. OK
- Prompt Box: Requests some information from the user, and upon entering it, the user has to click the OK button
Now that we have seen the types, let's see how we can display these message boxes on the screen in the same sequence as above:
pag.confirm("Are you ready?")
pag.alert("The program has crashed!")
pag.prompt("Please enter your name: ")
Running these lines displays three message boxes in sequence: Confirm, then Alert, then Prompt.
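The message box functions also return values: confirm() returns the text of the button that was clicked, and prompt() returns the text that was typed, or None if the box was cancelled. A minimal sketch of using those results follows. The helper greet and its injected callables are assumptions of this example, so that the flow can be tried without opening real dialogs.

```python
def greet(confirm, prompt, alert):
    """Ask for a name only if the user confirms, then show a greeting.

    Returns the greeting, or None if the user cancels at any step.
    """
    if confirm("Are you ready?") != 'OK':
        return None
    name = prompt("Please enter your name: ")
    if not name:
        return None
    message = "Hello, " + name + "!"
    alert(message)
    return message


def demo():
    """Show the real message boxes; needs a desktop session."""
    import pyautogui as pag

    greet(pag.confirm, pag.prompt, pag.alert)
```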
Conclusion
In this tutorial, we learned how to use the PyAutoGUI automation library in Python. We started off by talking about the prerequisites for this tutorial and the library's installation process for different operating systems, followed by some of its general functions. After that, we studied the functions for mouse movement, mouse control, and keyboard control, as well as screenshots and message boxes.
After following this tutorial, you should be able to use PyAutoGUI to automate GUI operations for repetitive tasks in your own application.