Difference between revisions of "Develop dataspectsSystem"

From FindAndLearn::Cookbook

Revision as of 14:18, 7 November 2019

Run the stack

Standard Services dataspects Services

Mongo

Elasticsearch

Kibana

ELASTICSEARCH_HOSTS=http://127.0.0.1:9200

Tika

Redis

Nats

Ui

PORT=3000
REDIS_HOST=<HOSTNAME>
REDIS_PORT=6379
SESSION_SECRET=
MONGODB_URI=mongodb://<HOSTNAME>:27017/<DB_NAME>
SMTP_SERVER=
SMTP_USERNAME=
SMTP_PASSWORD=
FROM_EMAIL=
FROM_NAME=
SITE_URL=https://ui.dataspects.com
OTP_DOMAIN=ui.dataspects.com
API_KEY=
ES_NODE=http://<HOSTNAME>:9200
DOCUMENT_DATA_STORE_API_URL=http://<HOSTNAME>:3003
DOCUMENT_DATA_STORE_API_MASTER_KEY=
APM_SERVER_URL=http://<HOSTNAME>:8200

Datastore

https://github.com/dataspects/datastore

go/src/github.com/dataspects/datastore$ go run main.go --c config.yml --p 3001 --d datastore.db
# config.yml
apikey: ""
nats:
  host: localhost
  port: 4222
tika:
  host: localhost
  port: 9998

Indexer

https://github.com/dataspects/indexer

go/src/github.com/dataspects/indexer$ go run main.go --c config.yml
# config.yml
elastic-search:
  host: http://localhost
  port: 9200
  username: elastic
  password: 
nats:
  host: http://localhost
  port: 4222
tika:
  host: http://localhost
  port: 9998
  username:
  password:

Mariadb

Apache

Parsoid

Ideally, run all services except the UI, datastore, and indexer via https://github.com/dataspects/dataspectsSystems.

Feeding data from ResourceSilos to indices

MediaWiki

PUSH: DataspectsMediaWikiFeeder

PULL: mediawiki-workbench

https://github.com/dataspects/mediawiki-workbench

Code Repository (non-git)

https://github.com/dataspects/dsrepository-cli

From Repository to Datastore by https://github.com/dataspects/dsrepository-cli

Data fed?
Configuration

Configure the datastore:

<ID>      automatic
<Label>   Shown in information sources list on https://ui.dataspects.com/search
<API Key> automatic
<Regex>   Only file names matching this regex will be fed to the datastore (Regex Tester - Golang)

Configure the feeder:

# <ID> and <API Key> are shown at https://ui.dataspects.com/datastores/repositories/code
user@workstation:/yourrepo$ ./dsrepository-cli \
                              --id  <ID> \
                              --url https://datastore.dataspects.com/repositories/code \
                              --key <API Key>
Subsequent pipeline

Items fed to the datastore will be fed on to an Indexer.

Currently the CodeIndexer does not index the actual file contents. Instead, it scans the file for the pattern (?i)error *(\d{4}:\d{2})(.*) and creates an ErrorCode entity named after the captured \d{4}:\d{2} code, with an "OccursInFile" annotation.
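
The scan described above can be sketched in Go as follows. This is only an illustration of the quoted pattern, not the CodeIndexer's actual code: the ErrorCode struct, the extractErrorCodes function, and the sample file name and content are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// errorCodePattern is the pattern quoted in the text above.
var errorCodePattern = regexp.MustCompile(`(?i)error *(\d{4}:\d{2})(.*)`)

// ErrorCode is a hypothetical stand-in for the entity the indexer creates.
type ErrorCode struct {
	Name         string // the captured \d{4}:\d{2} code
	OccursInFile string // value of the "OccursInFile" annotation
}

// extractErrorCodes returns one ErrorCode per pattern match in content.
func extractErrorCodes(fileName, content string) []ErrorCode {
	var codes []ErrorCode
	for _, m := range errorCodePattern.FindAllStringSubmatch(content, -1) {
		codes = append(codes, ErrorCode{Name: m[1], OccursInFile: fileName})
	}
	return codes
}

func main() {
	// Hypothetical file content for illustration.
	content := "all fine\nERROR 1234:56 disk full\nerror 0001:02 timeout\n"
	for _, c := range extractErrorCodes("app.log", content) {
		fmt.Printf("%s OccursInFile %s\n", c.Name, c.OccursInFile)
	}
}
```

Because of the (?i) flag, both "ERROR" and "error" match. The first capture group holds the code itself and the second holds the rest of the line.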

Tools


Git Repository

Documents (non-repository)

https://github.com/dataspects/dsdocument-cli

From File system to Datastore by https://github.com/dataspects/dsdocument-cli

Data fed?
Configuration

Configure the datastore:

<Datastore ID>        automatic (e.g. 12)
<Datastore Label>     Shown in information sources list on https://ui.dataspects.com/search
<Datastore API Key>   automatic (e.g. c8b89bc3-0139-11wa-8ef3-8c164563716b)
<Datastore Doc Regex> Only file names matching this regex will be fed to the datastore (Regex Tester - Golang)

Configure and run the feeder to index matching files in and below the current folder:

# <Datastore ID> and <Datastore API Key> are shown at https://ui.dataspects.com/datastores/files
user@workstation:/yourfolder$ ./dsdocument-cli \
                              --id  <Datastore ID> \
                              --url https://datastore.dataspects.com \
                              --key <Datastore API Key>
Subsequent pipeline
Tools


Relevance engineering (RELENG)

ui/relevanceEngineering

UI design (results display and interaction design (faceting, drilldown))

  • Helper: uncomment link(rel='stylesheet', href='/css/dataspects-meta.css') in dataspects-ui/views/layout.pug
  • RequestTypes: mainRequest, predicateRequest, entityTypeRequest, actionRequest