Difference between revisions of "Request Attachment Archiver"

Revision as of 15:11, 24 September 2021

About the Hornbill Request Attachment Archiver Utility

The utility provides a simple, safe and secure way to extract file attachments from the Hornbill platform. The tool connects to your Hornbill instance in the cloud over HTTPS/SSL, so as long as you have standard internet access then you should be able to use the tool without the need to make any firewall configuration changes.

This tool does two things:

  1. Attachments to requests which have not been updated for x weeks (x ≥ 12) are bundled together in a .zip file
  2. Links to those attachments are removed from the Hornbill instance


Information
Important: One of the optimisations within the Hornbill platform is that the same file (e.g. an image in an email footer) is only stored once. A counter keeps track of how many times that file is used/referenced (within Service Manager). Only once the counter is zero (i.e. there is no request referencing that attachment) is the actual file removed. The "removal" in this utility only reduces the counter by one; if that happens to bring the reference count to zero, the actual file is subsequently removed. Within the affected requests, you WILL still see a reference to the file in the "Attachments" section (i.e. so you still have an overview of the files attached), BUT a download attempt will fail. At that point, you will need to refer to the backed up/archived .zip file, which is easily identified by the call reference.
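
The reference counting described above can be pictured with a small toy model. This is only an illustrative sketch, not Hornbill's implementation: the class name `AttachmentStore`, its methods and the file name are all hypothetical.

```python
class AttachmentStore:
    """Toy model: one stored copy per file plus a usage counter (hypothetical)."""

    def __init__(self):
        self.ref_counts = {}  # file id -> number of requests referencing it

    def attach(self, file_id):
        """Record one more request referencing the file."""
        self.ref_counts[file_id] = self.ref_counts.get(file_id, 0) + 1

    def remove_link(self, file_id):
        """Reduce the counter by one; drop the file only when it reaches zero.

        Returns True if the stored file itself was removed.
        """
        self.ref_counts[file_id] -= 1
        if self.ref_counts[file_id] == 0:
            del self.ref_counts[file_id]
            return True
        return False


store = AttachmentStore()
store.attach("footer-logo.png")  # referenced by request A
store.attach("footer-logo.png")  # referenced by request B
first = store.remove_link("footer-logo.png")   # one reference still remains
second = store.remove_link("footer-logo.png")  # counter reaches zero
```

In this model the first removal leaves the file in place because another request still references it; only the second removal deletes it.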

Open Source

The Request Attachment Archiver Utility is provided open source under the Hornbill Community Licence and can be found here on GitHub

Installation Overview

Windows Installation

  • Download the ZIP archive relevant to your OS and architecture
  • Extract zip into a folder you would like the application to run from e.g. C:\HornbillRequestArchive\
  • Open conf.json and add in the necessary configuration
  • Open a Command Line Prompt as Administrator
  • Change Directory to the folder containing the import files C:\HornbillRequestArchive\
  • Run the command:

For Windows Systems: goRequestAttachmentArchiver.exe -cutoff=26 -dryrun=true -file=conf.json

For Mac OSX and Linux Systems: ./goRequestAttachmentArchiver -cutoff=26 -dryrun=true -file=conf.json

To run this on a schedule, you might want to consider the following sample usage, which saves the files to a local folder named after the current date:

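setlocal
rem Assumes %date% is formatted DD/MM/YYYY; adjust the offsets for other regional settings
set M=%date:~3,2%
set Y=%date:~6,4%
set D=%date:~0,2%
rem Store this run's archives in a folder named YYYYMMDD
goRequestAttachmentArchiver.exe -cutoff=52 -dryrun=true -output=%Y%%M%%D%
endlocal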
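
The substring offsets applied to %date% depend on the regional date format of the machine. Where that is a concern, the folder name could be computed by a small wrapper script instead. The following is a minimal sketch, assuming Python is available on the scheduling machine; the helper name `output_folder_name` is hypothetical.

```python
from datetime import date


def output_folder_name(today):
    """Return a YYYYMMDD folder name, independent of regional date settings."""
    return today.strftime("%Y%m%d")


# A fixed date is used here so the result is predictable;
# a scheduled job would pass date.today() instead.
folder = output_folder_name(date(2021, 9, 24))
output_arg = f"-output={folder}"
```

The resulting `output_arg` string can then be passed to the utility in place of `-output=%Y%%M%%D%`.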

Configuration Overview

A demonstration configuration file is provided within the package. If a configuration file is not specified as a command line argument when executing the tool, then a default configuration file named conf.json, containing the correct JSON, must exist:

{
	"InstanceID": ""
	, "APIKeys": [
		""
	]
	, "AttachmentFolder": "C:/Temp/"
}

Config

  • "InstanceID" - the name of your Hornbill instance and can be found within the URL you use to navigate to it: live.hornbill.com/[instance name]/. E.g. if the URL you use to access your instance is live.hornbill.com/arescomputing/, then your instance id would be "arescomputing". This value is case sensitive.
  • "APIKeys" - an array of Hornbill API keys, each for a user account with the correct permissions to carry out all of the required API calls. Details on how to create an API key can be found here.
  • "AttachmentFolder" - The location where the files are going to be archived.
    • The format of the .zip file will be REQUESTID_2015-11-06T14-26-13Z.zip - each attachment that was found for that request will appear in the .zip file.
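
Before a first run it can be useful to confirm that the configuration file contains the three settings described above. The following is a minimal sketch of such a check; it is not part of the utility, and the function name `check_config` and the wording of its messages are assumptions made for illustration.

```python
import json

# Expected top-level keys and their JSON types, taken from the sample conf.json
REQUIRED = {"InstanceID": str, "APIKeys": list, "AttachmentFolder": str}


def check_config(text):
    """Return a list of problems found in a conf.json document (empty if none)."""
    conf = json.loads(text)
    problems = []
    for key, expected in REQUIRED.items():
        if key not in conf:
            problems.append(f"missing key: {key}")
        elif not isinstance(conf[key], expected):
            problems.append(f"{key} should be a {expected.__name__}")
        elif not conf[key]:
            problems.append(f"{key} is empty")
    keys = conf.get("APIKeys")
    if isinstance(keys, list) and not all(isinstance(k, str) and k for k in keys):
        problems.append("APIKeys contains an empty or non-string entry")
    return problems


# The demonstration file ships with an empty API key, which this check reports
sample = '{"InstanceID": "arescomputing", "APIKeys": [""], "AttachmentFolder": "C:/Temp/"}'
problems = check_config(sample)
```

An empty list means all three settings are present and populated; it does not confirm that the instance name or API key are actually valid.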

Command Line Parameters

  • file - Defaults to `conf.json` - Name of the Configuration file to load
  • dryrun - Defaults to `false` - When set to `true`, the code for the REMOVAL of the attachments is not called; instead, the generated XML for each asset is dumped to the log file. This is to aid in debugging the initial connection information.
  • output - Folder to store downloads in - overrides AttachmentFolder from the configuration file.
  • cutoff - Defaults to `12`. Set the cut off date in weeks (12 or greater) - requests which haven't been touched for longer than this amount of time will be picked up.
  • pagesize - Defaults to `100` - Default Query Size (how many results per page).
  • call - If a specific Request ID is given, then that request will be archived.
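
If the utility is launched from a wrapper script, the parameters above can be assembled and sanity-checked before the executable is called. The following is a minimal sketch under that assumption; the function `build_args` is hypothetical, and its checks simply mirror the documented defaults and the "12 or greater" rule rather than anything confirmed about the utility's own validation.

```python
def build_args(cutoff=12, dryrun=False, file="conf.json",
               output=None, pagesize=100, call=None):
    """Build the command line arguments documented above (hypothetical helper)."""
    if cutoff < 12:
        raise ValueError("cutoff must be 12 weeks or greater")
    if pagesize < 1:
        raise ValueError("pagesize must be a positive number")
    args = [
        f"-file={file}",
        f"-dryrun={str(dryrun).lower()}",  # the flag is written as true/false
        f"-cutoff={cutoff}",
        f"-pagesize={pagesize}",
    ]
    # output and call are only added when a value was supplied
    if output is not None:
        args.append(f"-output={output}")
    if call is not None:
        args.append(f"-call={call}")
    return args


args = build_args(cutoff=26, dryrun=True, call="IN01234567")
```

The returned list can be appended to the executable name and handed to `subprocess.run`.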

Testing Overview

There is no substitute for hands-on experience when becoming familiar with the Hornbill import utilities.

goRequestAttachmentArchiver.exe -call=IN01234567 -cutoff=26 -dryrun=true -file=conf.json

This should create an IN01234567_2015-11-06T14-26-13Z.zip file in the AttachmentFolder configured in the .json. You should be able to open the file as usual and compare the files in the .zip with those attached to the call.
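
The comparison can also be scripted. The following is a minimal sketch that lists the contents of an archive and reports any expected file names that are missing. The helper names and the file names are hypothetical, and the list of expected names has to be compiled by hand from the request. An in-memory archive stands in for the real .zip file so the example is self-contained.

```python
import io
import zipfile


def archived_names(zip_source):
    """Return the sorted file names stored in a .zip (path or file-like object)."""
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(archive.namelist())


def missing_from_archive(zip_source, expected_names):
    """Return the expected names that are not present in the archive."""
    return sorted(set(expected_names) - set(archived_names(zip_source)))


# Build a small in-memory archive to stand in for the generated .zip file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("screenshot.png", b"example")
    archive.writestr("error.log", b"example")
buffer.seek(0)

missing = missing_from_archive(buffer, ["screenshot.png", "error.log", "notes.txt"])
```

For a real check, pass the path of the generated .zip file instead of `buffer`.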

Command Line Output

After each run of the utility, the command line will output a summary of the records that were processed.

This output can also be found in the log files which should be examined to understand why records failed to archive. In the case of a failed archive, even if this is only due to a problem with one of the attributes, then the attachments will NOT be purged from the request.

Information
Important: If you are running the script for the first time, there is probably a lot of data to process.

It is recommended that you process this in a few steps.

For instance, if you have 5 years (260 weeks) of accumulated requests, and wish to only remove the attachments of requests which have not been updated for longer than a year (52 weeks):

Instead of running the script with a cutoff of 52 (which you would do regularly AFTER this first exercise), run the script with a cutoff of 250, and then reduce it in manageable steps until you get to the 52 weeks (e.g. 225, 200, ..., 52).
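
The sequence of cutoff values for such a staged first run can be generated rather than worked out by hand. The following is a minimal sketch; the function name `cutoff_steps` and the step size of 25 weeks are assumptions based on the example above.

```python
def cutoff_steps(start, target, step):
    """Return cutoff values from start down to target, always ending on target."""
    if step <= 0:
        raise ValueError("step must be positive")
    if start < target:
        raise ValueError("start must not be smaller than target")
    steps = list(range(start, target, -step))  # stops before reaching target
    steps.append(target)                       # finish on the regular cutoff
    return steps


steps = cutoff_steps(250, 52, 25)
```

Each value in `steps` would be used as the -cutoff argument for one run, in order.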

API Key Rules

This utility uses the following API calls (API keys):

  • data:queryExec
  • data:entityAttachBrowse
  • data:entityAttachFile
  • data:entityAttachRemove
  • system:pingCheck

HTTP Proxies

If you use a proxy for all of your internet traffic, the HTTP_PROXY and HTTPS_PROXY environment variables need to be set. These environment variables hold the hostname or IP address of your proxy server. They are standard environment variables and, like any such variable, the specific steps you use to set them depend on your operating system.

For Windows machines, they can be set from the command line using the following:

set HTTP_PROXY=HOST:PORT
set HTTPS_PROXY=HOST:PORT

Where "HOST" is the IP address or host name of your proxy server and "PORT" is the specific port number. If you require a username and password to go through the proxy, the format for the setting is as follows:

set HTTP_PROXY=username:password@HOST:PORT
set HTTPS_PROXY=username:password@HOST:PORT
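
When the proxy settings are assembled by a script, the value can be built from its parts. The following is a minimal sketch; the function name `proxy_value` and the host name are hypothetical. Percent-encoding the username and password is the usual URL convention for characters such as "@" or ":", but whether the utility accepts encoded credentials is an assumption that should be verified.

```python
from urllib.parse import quote


def proxy_value(host, port, username=None, password=None):
    """Build a HOST:PORT or username:password@HOST:PORT proxy setting."""
    if username is None:
        return f"{host}:{port}"
    # Encode reserved characters so they cannot be mistaken for separators
    user = quote(username, safe="")
    secret = quote(password or "", safe="")
    return f"{user}:{secret}@{host}:{port}"


plain = proxy_value("proxy.example.com", 8080)
with_login = proxy_value("proxy.example.com", 8080, "jane", "p@ss:word")
```

The same value would normally be used for both HTTP_PROXY and HTTPS_PROXY.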

URLs to White List

Occasionally on top of setting the HTTP_PROXY variable the following URLs need to be white listed to allow access out to our network

Troubleshooting

Logging Overview

All logging output is saved in the log directory, in the same directory as the executable. The file name contains the date and time the utility was run, e.g. RAA_2015-11-06T14-26-13Z.log
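
Because the run time is embedded in the file name, log files can be sorted or filtered by date without opening them. The following is a minimal sketch that parses the timestamp from a name in the format shown above; the function name `log_timestamp` is hypothetical, and treating the trailing "Z" as UTC is an assumption.

```python
from datetime import datetime, timezone

PREFIX = "RAA_"
SUFFIX = ".log"


def log_timestamp(filename):
    """Parse the run time from a name like RAA_2015-11-06T14-26-13Z.log."""
    if not (filename.startswith(PREFIX) and filename.endswith(SUFFIX)):
        raise ValueError(f"unexpected log file name: {filename}")
    stamp = filename[len(PREFIX):-len(SUFFIX)]
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H-%M-%SZ")
    return parsed.replace(tzinfo=timezone.utc)  # assumption: "Z" means UTC


run_time = log_timestamp("RAA_2015-11-06T14-26-13Z.log")
```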

Common Error Messages

Below are some common errors that you may encounter in the log file and what they mean:

  • [ERROR] Error Decoding Configuration File:..... - this will be typically due to a missing quote (") or comma (,) somewhere in the configuration file. This is where an online JSON viewer/validator can come in handy rather than trawling the conf file looking for that proverbial needle in a haystack.
  • [ERROR] https:// ........invalid request :path "//xmlmc//apps/com.hornbill.servicemanager/?method=[methodName]" - If you identify errors stating an "invalid request path" for one or more API calls, this is typically due to a missing or incorrect instance name specified in the conf.json file. Check the instance id is correct. It also may be prudent to check you have added a valid API key too.
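
As an alternative to an online validator for the "Error Decoding Configuration File" case, the position of a JSON syntax error can be found locally. The following is a minimal sketch using Python's standard json module; the function name `describe_json_error` is hypothetical, and the broken sample is deliberately missing a comma after the "InstanceID" line.

```python
import json


def describe_json_error(text):
    """Return a description of the first JSON syntax error, or None if valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# The comma after the "InstanceID" entry has been left out on purpose
broken = '{\n  "InstanceID": "arescomputing"\n  "APIKeys": ["key"]\n}'
report = describe_json_error(broken)
```

The reported position is where the parser gave up, which is usually just after the missing quote or comma.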

Error Codes

  • 100 - Unable to create log File
  • 101 - Unable to create log folder
  • 102 - Unable to Load Configuration File

Scheduling Overview

Windows

You can schedule goRequestAttachmentArchiver.exe to run with any optional command line argument from Windows Task Scheduler.

  • Ensure the user account running the task has rights to goRequestAttachmentArchiver.exe and the containing folder.
  • Make sure the Start In parameter contains the folder where goRequestAttachmentArchiver.exe resides, otherwise it will not be able to pick up the correct path.