Difference between revisions of "Develop dataspectsSystem"

From FindAndLearn::Cookbook
(Feeding data from ResourceSilos to indices)
(Code Repository (non-git))
 
(6 intermediate revisions by the same user not shown)
Line 2: Line 2:
  
 
{{#mermaid:
 
{{#mermaid:
graph TD
+
graph LR
mongo
+
 
 
elasticsearch
 
elasticsearch
 
kibana["<b>Kibana</b>"]
 
kibana["<b>Kibana</b>"]
 
tika
 
tika
redis
 
nats
 
 
ui["<b>UI</b>"]
 
ui["<b>UI</b>"]
datastore
+
dataspectsd["<b>dataspectsd</b>"]
indexer
 
mariadb
 
subgraph MediaWiki
 
apache
 
dataspectsmediawikifeeder["<b>DataspectsMediaWikiFeeder</b>"]
 
end
 
parsoid
 
dsrepositorycli["<b>dsrepository-cli</b>"]
 
 
dataspecter["<b>dataspecter</b>"]
 
dataspecter["<b>dataspecter</b>"]
dsdocumentcli["<b>dsdocument-cli</b>"]
 
 
ui-->mongo
 
ui-->redis
 
ui-->elasticsearch
 
ui-->tika
 
ui-->|manage|datastore
 
  
apache-->mariadb
+
ui---dataspectsd
apache-->parsoid
 
  
kibana-->elasticsearch
+
kibana---elasticsearch
 +
dataspectsd---elasticsearch
  
datastore-->|publish|nats
+
dataspectsd---tika
datastore-->tika
+
dataspecter---tika
dataspectsmediawikifeeder-->|write|datastore
+
dataspecter---Folder
 +
dataspecter---Repository
 +
MediaWiki-->|DataspectsMediaWikiFeeder|dataspectsd
  
indexer-->|subscribe|nats
+
dataspecter---MediaWiki
indexer-->tika
+
dataspecter---dataspectsd
indexer-->elasticsearch
 
  
dsrepositorycli-->|write|datastore
+
classDef ResourceSilo fill:lightgreen
 
 
dataspecter-->|read/write|apache
 
dataspecter-->|write|datastore
 
 
 
dsdocumentcli-->|write|datastore
 
 
 
classDef indexer fill:lightgreen
 
class indexer indexer
 
  
 
classDef gui fill:lightblue
 
classDef gui fill:lightblue
class kibana,ui,dsrepositorycli,dsdocumentcli,dataspecter,dataspectsmediawikifeeder gui
+
class kibana,ui,dataspecter,dataspectsmediawikifeeder gui
 +
class Repository,Folder,MediaWiki ResourceSilo
 
}}
 
}}
  
Line 60: Line 37:
 
|-
 
|-
 
|width=50%|
 
|width=50%|
=== Mongo ===
 
 
=== Elasticsearch ===
 
=== Elasticsearch ===
 
=== Kibana ===
 
=== Kibana ===
Line 67: Line 43:
 
</syntaxhighlight>
 
</syntaxhighlight>
 
=== Tika ===
 
=== Tika ===
=== Redis ===
 
=== Nats ===
 
 
|
 
|
=== Ui ===
+
=== UI ===
 
* https://github.com/dataspects/dataspects-ui
 
* https://github.com/dataspects/dataspects-ui
 
* [[C0792223532]]
 
* [[C0792223532]]
Line 93: Line 67:
 
</syntaxhighlight>
 
</syntaxhighlight>
  
=== Datastore ===
+
=== dataspectsd ===
https://github.com/dataspects/datastore
+
https://github.com/dataspects/dataspectsd
  go/src/github.com/dataspects/datastore$ go run main.go --c config.yml --p 3001 --d datastore.db
+
  go/src/github.com/dataspects/dataspectsd$ go run main.go --c config.yml --p 3001 --d dataspectsd.db
 
 
<syntaxhighlight lang=yaml>
 
# config.yml
 
apikey: ""
 
nats:
 
  host: localhost
 
  port: 4222
 
tika:
 
  host: localhost
 
  port: 9998
 
</syntaxhighlight>
 
 
 
=== Indexer ===
 
https://github.com/dataspects/indexer
 
go/src/github.com/dataspects/indexer$ go run main.go --c config.yml
 
  
 
<syntaxhighlight lang=yaml>
 
<syntaxhighlight lang=yaml>
 
# config.yml
 
# config.yml
 +
apikey: "ping"
 
elastic-search:
 
elastic-search:
 
   host: http://localhost
 
   host: http://localhost
 
   port: 9200
 
   port: 9200
   username: elastic
+
   username:
   password:  
+
   password:
nats:
 
  host: http://localhost
 
  port: 4222
 
 
tika:
 
tika:
   host: http://localhost
+
   host: localhost
 
   port: 9998
 
   port: 9998
  username:
 
  password:
 
 
</syntaxhighlight>
 
</syntaxhighlight>
  
|-
 
|
 
 
=== Mariadb ===
 
|
 
=== Apache ===
 
=== Parsoid ===
 
 
|}
 
|}
  
Ideally you run all services except UI, datastore and indexer by https://github.com/dataspects/dataspectsSystems.
+
Ideally you run all services except UI, dataspectsd and dataspecter by https://github.com/dataspects/dataspectsSystems.
  
 
== Feeding data from ResourceSilos to indices ==
 
== Feeding data from ResourceSilos to indices ==
Line 144: Line 92:
 
{{#mermaid:
 
{{#mermaid:
 
graph LR
 
graph LR
DSR[<b>Dataspecter</b>]
+
DSR[<b>dataspecter</b>]
 
RS[<b>ResourceSilo</b>]
 
RS[<b>ResourceSilo</b>]
DS[<b>Datastore</b><br>resource-silo-specific/<br>domain-agnostic SIGINT]
+
DD[<b>dataspectsd</b><br>resource-silo-specific/<br>domain-agnostic SIGINT]
subgraph domain-specific SIGINT/QUALAS/RELENG
 
IN[<b>Indexer</b>]
 
VA[<b>Validation</b>]
 
IN-.-VA
 
end
 
 
ES[<b>Elasticsearch</b>]
 
ES[<b>Elasticsearch</b>]
RS-->|native feed|DS
+
RS-->|native feed|DD
IN-->ES
+
DD-->ES
DS-->|index|IN
+
DSR-->|external feed|DD
DSR-->|external feed|DS
 
 
GUI[<b>User Experience</b><br>domain-specific]
 
GUI[<b>User Experience</b><br>domain-specific]
 
ES-->GUI
 
ES-->GUI
Line 175: Line 117:
  
 
=== Code Repository (non-git) ===
 
=== Code Repository (non-git) ===
 
https://github.com/dataspects/dsrepository-cli
 
  
 
{{#ask:[[C0334941560]]
 
{{#ask:[[C0334941560]]

Latest revision as of 06:42, 2 December 2019

Run the stack

Standard Services dataspects Services

Elasticsearch

Kibana

ELASTICSEARCH_HOSTS=http://127.0.0.1:9200

Tika

UI

PORT=3000
REDIS_HOST=<HOSTNAME>
REDIS_PORT=6379
SESSION_SECRET=
MONGODB_URI=mongodb://<HOSTNAME>:27017/<DB_NAME>
SMTP_SERVER=
SMTP_USERNAME=
SMTP_PASSWORD=
FROM_EMAIL=
FROM_NAME=
SITE_URL=https://ui.dataspects.com
OTP_DOMAIN=ui.dataspects.com
API_KEY=
ES_NODE=http://<HOSTNAME>:9200
DOCUMENT_DATA_STORE_API_URL=http://<HOSTNAME>:3003
DOCUMENT_DATA_STORE_API_MASTER_KEY=
APM_SERVER_URL=http://<HOSTNAME>:8200

dataspectsd

https://github.com/dataspects/dataspectsd

go/src/github.com/dataspects/dataspectsd$ go run main.go --c config.yml --p 3001 --d dataspectsd.db
# config.yml
apikey: "ping"
elastic-search:
  host: http://localhost
  port: 9200
  username:
  password:
tika:
  host: localhost
  port: 9998
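
Note that the config above gives the Elasticsearch host with a scheme (http://localhost) and the Tika host without one (localhost). How dataspectsd itself interprets these fields is not stated here; the following is only a minimal sketch, assuming a hypothetical Endpoint type and BaseURL helper, of how such host/port pairs could be normalised into base URLs:

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint mirrors one host/port pair from config.yml.
// Hypothetical type for illustration; not taken from dataspectsd.
type Endpoint struct {
	Host string
	Port int
}

// BaseURL joins host and port, defaulting to http:// when the host
// carries no scheme (an assumption, not documented behaviour).
func (e Endpoint) BaseURL() string {
	host := e.Host
	if !strings.Contains(host, "://") {
		host = "http://" + host
	}
	return fmt.Sprintf("%s:%d", host, e.Port)
}

func main() {
	elastic := Endpoint{Host: "http://localhost", Port: 9200}
	tika := Endpoint{Host: "localhost", Port: 9998}
	fmt.Println(elastic.BaseURL()) // http://localhost:9200
	fmt.Println(tika.BaseURL())    // http://localhost:9998
}
```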

Ideally, run all services except UI, dataspectsd and dataspecter via https://github.com/dataspects/dataspectsSystems.

Feeding data from ResourceSilos to indices

MediaWiki

PUSH: DataspectsMediaWikiFeeder

PULL: dataspecter

https://github.com/dataspects/dataspecter

Code Repository (non-git)

From Repository to dataspectsd by

Data fed?
Configuration

Configure the datastore:

<ID>        automatic
<Label>     Shown in information sources list on https://ui.dataspects.com/search
<ClientKey> automatic
<Regex>     Only file names matching this regex will be fed to the datastore (Regex Tester - Golang)
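
Because the <Regex> setting uses Go regular expression syntax, it can help to try a pattern against a few file names before saving it. The following is a minimal sketch, not dataspecter's actual code; filterNames and the sample file names are hypothetical:

```go
package main

import (
	"fmt"
	"regexp"
)

// filterNames returns the names that match pattern (Go RE2 syntax).
// Hypothetical helper for illustration; not taken from dataspecter.
func filterNames(pattern string, names []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid regex %q: %w", pattern, err)
	}
	var matched []string
	for _, name := range names {
		// MatchString is unanchored: use ^ and $ to match the whole name.
		if re.MatchString(name) {
			matched = append(matched, name)
		}
	}
	return matched, nil
}

func main() {
	names := []string{"main.go", "README.md", "config.yml", "main_test.go"}
	matched, err := filterNames(`\.(go|yml)$`, names)
	if err != nil {
		panic(err)
	}
	fmt.Println(matched) // [main.go config.yml main_test.go]
}
```

Go's regexp package does not support lookarounds or backreferences, so patterns written for other regex flavours may fail to compile.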

Configure the feeder:

# <ID> and <API Key> are shown at https://ui.dataspects.com/datastores/repositories/code
/yourrepo$ ./dataspecter \
             --id  <ID> \
             --url https://datastore.dataspects.com \
             --key <API Key>
Subsequent pipeline
Tools


Git Repository

Documents (non-repository)

https://github.com/dataspects/dsdocument-cli

From the file system to the datastore via https://github.com/dataspects/dsdocument-cli

Data fed?
Configuration

Configure the datastore:

<Datastore ID>        automatic (e.g. 12)
<Datastore Label>     Shown in information sources list on https://ui.dataspects.com/search
<Datastore API Key>   automatic (e.g. c8b89bc3-0139-11wa-8ef3-8c164563716b)
<Datastore Doc Regex> Only file names matching this regex will be fed to the datastore (Regex Tester - Golang)

Configure and run the feeder to index matching files in and below the current folder:

# <Datastore ID> and <Datastore API Key> are shown at https://ui.dataspects.com/datastores/files
/yourfolder$ ./dsdocument-cli \
               --id  <Datastore ID> \
               --url https://datastore.dataspects.com \
               --key <Datastore API Key>
Subsequent pipeline
Tools


Signals Intelligence (SIGINT)

Relevance Engineering (RELENG)

ui/relevanceEngineering

Quality Assurance (QA)

UI design (results display and interaction design: faceting, drilldown)

  • Helper: uncomment link(rel='stylesheet', href='/css/dataspects-meta.css') in dataspects-ui/views/layout.pug
  • RequestTypes: mainRequest, predicateRequest, entityTypeRequest, actionRequest