YAWL 4.2.x

Selenium WebDriver 3.1x

(It should be compatible with newer versions)

Attached files:

  1. The Codelet containing the RPA script, called “FirstTutorialsCodelet.jar”
  2. The GeckoConfig.xml, which contains the path to the WebDriver, packed within the “.GeckoConfigYAWL” directory
  3. The YAWL net “Tutorial1_SearchForData”
  4. The GeckoDriver itself (can be downloaded from the project's page)
  5. The Codelet's source code
  6. The YAWL organizational data “YAWLOrgDataExport”

 

1 Motivation

Robotic Process Automation (RPA) describes technologies that operate on graphical user interfaces and imitate human interaction with them. There are plenty of approaches to using RPA in an industrial context. Besides the big market leaders like UiPath or BluePrism, there are some free and open source RPA tools like Argos Labs or Automation Anywhere. But as good as these tools are for their main purpose, most of them lack the ability to integrate RPA into an established industrial workflow automation system.

With that in mind, I would like to present an approach to automating web-based business processes through a combination of Workflow Management and Robotic Process Automation. For this approach I will use the well-researched open source Workflow Management System YAWL together with Selenium WebDriver, an open source framework, available for several programming languages, for automating interaction with a web browser.

Over the following weeks I will present a multi-part tutorial on how to set up the right environment for reproducing this approach, as well as how to create your own workflows and RPA scripts. In today's part I will demonstrate how to use Selenium WebDriver within YAWL to search for a plain text in Google.

 

2 Basics and set up

As you are already reading this text in a YAWL-specific user group, I assume you are familiar with the basics of workflow automation and YAWL. If not, I would advise you to visit the YAWL Foundation's website http://www.yawlfoundation.org and especially to read the YAWL user manual http://www.yawlfoundation.org/pages/support/manuals.html, which gives a very good overview of YAWL and of the possibilities it offers a company for automating its business processes.

2.1 Codelet

If you are already familiar with YAWL, you should know that YAWL offers a service for integrating different users into a workflow: the so-called Resource Service. A less well-known fact is that you can integrate software as a resource as well by writing a Codelet. A Codelet is a Java class that can be embedded directly in YAWL as an executing resource. This is a very powerful tool and offers many opportunities for different scenarios. Because of the scope of this topic I would advise you to read Chapter “4.12.1: Codelets” (Version 4.2) in the YAWL User Manual or to visit http://www.yaug.org/content/creating-and-using-codelets for further information.

Here I'll give a short explanation of how to integrate Codelets into YAWL. To enable YAWL to find Codelets, you have to change the “web.xml” file, which can be found in the YAWL installation directory “\...\YAWL-4.2\engine\apache-tomcat-7.0.65\webapps\resourceService\WEB-INF”. In this XML file you have to customize the value belonging to the element “<param-name>ExternalPluginsPath</param-name>”, because YAWL starts its search for Codelets in the Codelet root directory given there. As an example, my configuration on Windows uses the Codelet root directory “C:\yawlPlugins”, so the entry reads “<param-value>C:\yawlPlugins</param-value>”. A not so obvious but mandatory aspect is that you must build a directory structure within the Codelet root directory that mirrors the Java package structure of your Codelet. In this case, the Codelet lies in the Java package “codelet”, so you must create a directory “codelet” within the Codelet root directory. The attached Codelet must be placed in this “codelet” directory; otherwise YAWL may find the Codelet but will not be able to execute it.

The absolute path to my Codelet root directory is: C:\yawlPlugins

The path YAWL uses to find GoogleSearchForYAWL.jar is the Codelet root directory + the Java package structure + the Codelet's file name -> C:\yawlPlugins\codelet\GoogleSearchForYAWL.jar
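
The path rule above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names (CodeletLocationSketch, codeletJarPath) are hypothetical and not part of YAWL. Only the root directory, the package name and the jar name come from the text.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of YAWL: shows how root + package + file name combine.
class CodeletLocationSketch {

    // Builds <root>/<package segments as directories>/<jar file name>.
    static Path codeletJarPath(Path pluginRoot, String javaPackage, String jarFileName) {
        Path dir = pluginRoot;
        if (!javaPackage.isEmpty()) {
            // Each package segment becomes one directory level.
            for (String segment : javaPackage.split("\\.")) {
                dir = dir.resolve(segment);
            }
        }
        return dir.resolve(jarFileName);
    }

    public static void main(String[] args) {
        // Values from the text; the printed separator depends on the operating system.
        Path jar = codeletJarPath(Paths.get("C:\\yawlPlugins"), "codelet", "GoogleSearchForYAWL.jar");
        System.out.println(jar);
    }
}
```

A Codelet in a nested package such as “a.b” would accordingly need the directories “a” and “b” below the Codelet root directory.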

2.2 WebDriver

Because the Selenium WebDriver framework supports Java, it is possible to write a Codelet that uses the functionality of the WebDriver directly. There are different WebDrivers for different web browsers. For this tutorial we will use the so-called Geckodriver, which makes it possible to automate Mozilla's Firefox browser. An important step in getting the whole approach working is telling the Codelet where the Geckodriver executable lies. For this step you can choose between three different options:

 

2.2.1 (Strongly Recommended) Tell the Codelet where the Geckodriver lies by a provided config file

 

Attached is a directory called “.GeckoConfigYAWL” containing an XML file. You can change the given path in the XML by changing the text inside the “GeckoDriverFilePath” element. Take care that the path points directly to the executable file. This works independently of the operating system; just look up what your file is called and where it is located.

 

Windows example:

<?xml version="1.0"?>
<WebdriverForYAWLConfig>
  <GeckoDriverFilePath>C:\Users\Simon\Documents\geckodriver.exe</GeckoDriverFilePath>
  <RPAFilePath>
    <FilePath></FilePath>
  </RPAFilePath>
</WebdriverForYAWLConfig>

 

Linux example:

 

<?xml version="1.0"?>
<WebdriverForYAWLConfig>
  <GeckoDriverFilePath> home/Documents/driver/geckodriver </GeckoDriverFilePath>
  <RPAFilePath>
    <FilePath></FilePath>
  </RPAFilePath>
</WebdriverForYAWLConfig>

 

The directory with the changed XML configuration file should be placed in the user's home directory, so that the environment can find it at runtime.

On my Windows system this is the folder “C:\Users\Simon”, while on my Linux system it is the “/home/Simon” directory. If configured correctly, your environment should have no problem accessing the Geckodriver. If you followed carefully, you may have noticed that there are more XML elements with empty values. They will be used in later tutorials to configure where RPA scripts are located.
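
How a Codelet might read this file can be sketched with the XML parser that ships with the JDK. This is not the source of the attached Codelet, only a minimal illustration: the class and method names (GeckoConfigSketch, defaultConfigFile, readDriverPath) are hypothetical, and trimming the whitespace around the path is an assumption of this sketch. The directory, file and element names are taken from the text above.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical sketch, not the attached Codelet's actual implementation.
class GeckoConfigSketch {

    // Expected location: <user home>/.GeckoConfigYAWL/GeckoConfig.xml
    static Path defaultConfigFile() {
        return Paths.get(System.getProperty("user.home"), ".GeckoConfigYAWL", "GeckoConfig.xml");
    }

    // Returns the trimmed text of the first <GeckoDriverFilePath> element,
    // or null if the element is missing or empty.
    static String readDriverPath(InputStream xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList nodes = doc.getElementsByTagName("GeckoDriverFilePath");
            if (nodes.getLength() == 0) {
                return null;
            }
            String value = nodes.item(0).getTextContent().trim();
            return value.isEmpty() ? null : value;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the Gecko config file", e);
        }
    }

    public static void main(String[] args) throws Exception {
        Path config = defaultConfigFile();
        if (Files.isRegularFile(config)) {
            try (InputStream in = Files.newInputStream(config)) {
                System.out.println("Geckodriver path: " + readDriverPath(in));
            }
        } else {
            System.out.println("No config file at " + config);
        }
    }
}
```

Running the main method is a quick way to check whether the directory sits in the right place and whether the configured path is read as expected.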

 

2.2.2 Let the environment search for the Geckodriver in the system's path

 

You can use your system's PATH to make your Geckodriver accessible. To do so, add the location of your Geckodriver file to the PATH variable of your system. All necessary steps can be found on the following website of the Selenium developers: https://selenium.dev/documentation/en/webdriver/driver_requirements/.
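
To check whether the Geckodriver is actually reachable through the PATH, a small lookup like the following can help. It is a sketch under assumptions: the class and method names (PathLookupSketch, findOnPath) are hypothetical, and it only mimics in plain Java what such a search does. It is not how Selenium or the attached Codelet implement it.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch: searches the directories of a PATH-style string for an executable.
class PathLookupSketch {

    // Returns the first executable file with one of the given names, if any.
    static Optional<Path> findOnPath(String pathValue, String... fileNames) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        // Entries are separated by ';' on Windows and ':' on Linux and macOS.
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : fileNames) {
                Path candidate;
                try {
                    candidate = Paths.get(dir, name);
                } catch (InvalidPathException e) {
                    continue; // skip malformed PATH entries
                }
                if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> driver = findOnPath(System.getenv("PATH"), "geckodriver", "geckodriver.exe");
        System.out.println(driver.map(Path::toString).orElse("geckodriver not found on PATH"));
    }
}
```

Both file names are tried because the executable is called “geckodriver.exe” on Windows and “geckodriver” on Linux.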

 

2.2.3 Tell the Codelet where the Geckodriver lies by a YAWL parameter

 

The last option for telling the environment where your Geckodriver is located is to pass the information as a string parameter directly in YAWL. As soon as you define a task as an RPA task, it automatically gets two parameters. One represents the text you are searching for and the other represents the path to your Geckodriver. You can either declare the path as a default value or enter it as text at runtime. Neither way is very comfortable, because you need to define the path of the Geckodriver either for every RPA task or, worse, for every RPA task in every execution. And as the saying goes: don't repeat yourself! So it is a pretty easy way of getting started and “Hello Worlding”, but it isn't something you want to consider for serious use.

Note that you can use all three options together to be safe in case of an error. The provided environment tries one option after another until a Geckodriver file is found.
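
The “one option after another” behaviour can be pictured as a simple fallback chain. The sketch below is an assumption about the general pattern, not the attached Codelet's code: the class and method names are hypothetical, the three lookups are stubs, and the order in which the real environment tries the options is not stated here. The only established detail is that Selenium's Firefox support reads the driver location from the system property “webdriver.gecko.driver”.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of a fallback chain over several lookup strategies.
class DriverLookupChainSketch {

    // Tries each lookup in order and returns the first non-empty result.
    static Optional<String> firstFound(List<Supplier<Optional<String>>> lookups) {
        for (Supplier<Optional<String>> lookup : lookups) {
            Optional<String> result = lookup.get();
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stub lookups; the order shown is an assumption made for this sketch.
        List<Supplier<Optional<String>>> lookups = new ArrayList<>();
        lookups.add(() -> Optional.empty()); // stands in for the config file lookup
        lookups.add(() -> Optional.empty()); // stands in for the PATH lookup
        lookups.add(() -> Optional.ofNullable(args.length > 0 ? args[0] : null)); // stands in for the YAWL parameter

        Optional<String> driver = firstFound(lookups);
        // Selenium looks up the Geckodriver executable through this system property.
        driver.ifPresent(path -> System.setProperty("webdriver.gecko.driver", path));
        System.out.println(driver.orElse("no Geckodriver found"));
    }
}
```

With such a chain, a wrong or missing entry in one place does not stop the execution as long as a later option yields a path.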

3 Organizational data

If you want to reproduce my results, you should upload the organizational data as well. A file is provided for this, too. It is called “YAWLOrgDataExport” and should be uploaded in the YAWL control panel in the tab OrgData.

4 Description

 

The workflow consists of two tasks. First there is a task to enter relevant data for the search and for the configuration, second there is an automated task which hosts the Codelet.

Now that you have set up everything, you are ready for a first execution. To do so, upload the specification and launch a new case from the YAWL editor or from the YAWL control panel. You can now log in as the user “Ulfric User” or “Alfred Admin”. Their logins are “uu” and “aa” respectively, and the password for both is “1111”. Now you can start the work item and enter a text to search for in Google.

If you used the config file or the system's path to identify the location of the Geckodriver, you can ignore the lower field “Path of GeckoDriver”. As soon as you mark your work item as completed, a new Firefox browser window should open and execute a search in Google.
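
To make the “search in Google” step more tangible: one simple way a script could trigger such a search is to open a search URL that contains the encoded text. This is only an illustrative alternative and not necessarily what the attached Codelet does; it may just as well type the text into the search field. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch: turns the search text from the work item into a Google search URL.
class SearchUrlSketch {

    static String googleSearchUrl(String searchText) {
        try {
            // URL-encode the text so that spaces and special characters stay intact.
            return "https://www.google.com/search?q=" + URLEncoder.encode(searchText, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is available on every Java platform, so this should never happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints https://www.google.com/search?q=YAWL+workflow
        System.out.println(googleSearchUrl("YAWL workflow"));
    }
}
```

A WebDriver instance could then load this URL in the Firefox window it controls.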

Note that the YAWL Control Center outputs some error messages, which you can ignore for now. Second, you should not use your keyboard or mouse while the RPA script is executing, because doing so aborts the automated interaction and switches to manual mode. Last, keep in mind that the first execution needs some time (15-30 seconds). All further executions should finish much faster (3 seconds at most). After the execution you can simply close the web browser.

5 Conclusion and outlook

In this tutorial you have seen how to set up a “hello world”-ish environment for automating interaction with Google from within a workflow management system. In the next parts I will show how to write more complex RPA scripts and how to interact with a rich internet application. Finally, I will show how to use a provided Codelet to manage generic data transfer between the workflow management system and the RPA scripts, as well as how to handle exceptions.