Intro to Appium Clients

For all the reasons discussed in the main intro, Appium is based on the W3C WebDriver specification. This means that Appium implements a client-server architecture. The server (consisting of Appium itself along with any drivers or plugins you are using for automation) is connected to the devices under test, and is actually responsible for making automation happen on those devices. The client (driven by you, the Appium test author) is responsible for sending commands to the server over the network, and receiving responses from the server as a result. These responses can be used to tell whether automation commands are successful, or might contain information that you queried about the state of the application. This document is a conceptual introduction to the client side of this equation.

Info

For more about the server side of the equation (i.e., how does Appium actually control devices?), check out our Intro to Appium Drivers. To skip to a list of links to Appium client libraries, check out the Ecosystem documentation.

What sorts of automation commands are available? That is up to the particular driver and plugins that you are using in any given session. A standard set of commands would include, for example, the following:

  • Find Element
  • Click Element
  • Get Page Source
  • Take Screenshot

If you look at these commands in the WebDriver specification, you'll notice that they are not defined in terms of any particular programming language. They are not Java commands, or JavaScript commands, or Python commands. Instead, they form part of an HTTP API which can be accessed from within any programming language (or none! you could just use cURL if you want).

So, for example, the Find Element command corresponds to an HTTP POST request sent to the HTTP endpoint /session/:sessionid/element (where in this case, :sessionid is a placeholder for the unique session ID generated by the server in a previous call to Create Session).
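To make this concrete, here is a minimal sketch of what a Find Element request looks like when built by hand with Python's standard library. The server address and session ID below are placeholders assumed for illustration; a real session ID comes from the server's response to Create Session. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Placeholders assumed for illustration; substitute your own server address
# and the session ID returned by a previous Create Session call.
SERVER_URL = "http://127.0.0.1:4723"
SESSION_ID = "example-session-id"


def build_find_element_request(server_url, session_id, using, value):
    """Build (but do not send) the HTTP request for the Find Element command."""
    # The WebDriver spec expects a JSON body with a locator strategy ("using")
    # and a selector ("value").
    body = json.dumps({"using": using, "value": value}).encode("utf-8")
    return urllib.request.Request(
        url=f"{server_url}/session/{session_id}/element",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


request = build_find_element_request(
    SERVER_URL, SESSION_ID, "xpath", '//*[@text="Foo"]'
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With a running server and a real session ID, passing the request to `urllib.request.urlopen` would send it and return the server's JSON response.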

This information is primarily useful for people developing technology that works with the WebDriver spec. It's not particularly useful for people trying to write Appium or Selenium tests. When you write an Appium test, you want to use a programming language you're familiar with. Luckily, there exists a set of Appium client libraries[1] that take care of speaking HTTP to the Appium server for you. They expose a set of "native" commands for a particular programming language, so that, to the test author, it just feels like you're writing Python, or JavaScript, or Java.
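As a rough sketch of what such a library does internally, the following hypothetical `MiniDriver` and `MiniElement` classes wrap the HTTP calls behind ordinary Python methods and properties. This is not how any real Appium client is implemented; the class names, the `send` parameter, and the `http_send` helper are all invented for illustration. Only the endpoints and the element-ID key come from the WebDriver specification.

```python
import json
import urllib.request

# Key under which the WebDriver spec returns an element's ID.
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def http_send(method, url, payload=None):
    """Send one command over HTTP and return the "value" field of the response."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["value"]


class MiniElement:
    """Hypothetical element wrapper: remembers its ID and its driver."""

    def __init__(self, driver, element_id):
        self._driver = driver
        self.id = element_id

    def click(self):
        # Click Element: POST with an empty JSON object as the body.
        self._driver.execute("POST", f"/element/{self.id}/click", {})

    @property
    def text(self):
        # Get Element Text
        return self._driver.execute("GET", f"/element/{self.id}/text")


class MiniDriver:
    """Hypothetical driver wrapper bound to one existing session."""

    def __init__(self, server_url, session_id, send=http_send):
        self._base = f"{server_url}/session/{session_id}"
        self._send = send  # injectable so the sketch can run without a server

    def execute(self, method, path, payload=None):
        return self._send(method, self._base + path, payload)

    def find_element(self, using, value):
        # Find Element: the response value maps the W3C key to the element ID.
        found = self.execute("POST", "/element", {"using": using, "value": value})
        return MiniElement(self, found[W3C_ELEMENT_KEY])

    @property
    def page_source(self):
        # Get Page Source
        return self.execute("GET", "/source")
```

Given an existing session, `MiniDriver(server_url, session_id).find_element("xpath", '//*[@text="Foo"]').click()` would read like plain Python while issuing two HTTP requests. Real client libraries also handle session creation, error responses, and much more.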

As an example, here's the same simple set of Appium commands in five different programming languages, using the recommended Appium client binding for each language (note that this is not working sample code including all appropriate imports; see each client library's instructions for setup and command reference):

JavaScript:

const element = await driver.$('//*[@text="Foo"]');
await element.click();
console.log(await element.getText());
console.log(await driver.getPageSource());

Java:

WebElement element = driver.findElement(By.xpath("//*[@text='Foo']"));
element.click();
System.out.println(element.getText());
System.out.println(driver.getPageSource());

Python:

element = driver.find_element(by=By.XPATH, value='//*[@text="Foo"]')
element.click()
print(element.text)
print(driver.page_source)

Ruby:

element = driver.find_element :xpath, '//*[@text="Foo"]'
element.click
puts element.text
puts driver.page_source

C#:

AppiumElement element = driver.FindElement(By.XPath("//*[@text='Foo']"));
element.Click();
System.Console.WriteLine(element.Text);
System.Console.WriteLine(driver.PageSource);

Each of these scripts, despite being in different languages, does the same thing under the hood:

  1. Call Find Element with a using parameter of xpath and a value parameter expressing the XPath query used to find an element. (If you're confused about these terms, you might find an introduction to Appium or Selenium useful.)
  2. Call Click Element with the ID of the element found in the previous call.
  3. Call Get Element Text with the ID of the same element, and print it to the console.
  4. Call Get Page Source to retrieve the page/app source and print it to the console.
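The four steps above can be sketched as the sequence of HTTP calls a client would make. The endpoints follow the WebDriver specification; the function name `wire_calls` and the `SESSION_ID` and `ELEMENT_ID` strings are placeholders assumed for illustration. In practice, the element ID is read from the response to the first call.

```python
import json


def wire_calls(session_id, element_id, xpath):
    """Return the (method, path, JSON body) triples for the four commands."""
    base = f"/session/{session_id}"
    return [
        # 1. Find Element
        ("POST", f"{base}/element", {"using": "xpath", "value": xpath}),
        # 2. Click Element (POST with an empty JSON object as the body)
        ("POST", f"{base}/element/{element_id}/click", {}),
        # 3. Get Element Text
        ("GET", f"{base}/element/{element_id}/text", None),
        # 4. Get Page Source
        ("GET", f"{base}/source", None),
    ]


for method, path, body in wire_calls("SESSION_ID", "ELEMENT_ID", '//*[@text="Foo"]'):
    print(method, path, "" if body is None else json.dumps(body))
```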

The only other thing to keep in mind before choosing or using a client is that each client is independently maintained. Just because a feature is available in one client doesn't mean it's available in another (though all clients support at least the standard W3C protocol plus any common Appium extensions). Just because one client has a nice set of helper functions doesn't mean another will. Some clients are updated very frequently, and others are not! So when choosing a library, the first consideration is the language you want to use, and the second is how full-featured and well-maintained the library is.

To learn how to use an Appium client, visit that client's homepage. In many cases, the Appium client for a given language is built on top of the Selenium client for that language, and so certain Appium clients may only document the features that the Appium client adds on top of the Selenium client. All that to say, for a full reference, you may need to visit both the Appium client documentation and the Selenium client documentation.

That's all you need to know about Appium clients! Head over to the Ecosystem page to check out the current list of clients.


  1. These libraries are alternately called "clients", "client libraries", or "client bindings". They all mean the same thing!